DML-iTrack-3D

 

Benchmark Eye-Tracking Dataset for Visual Saliency Prediction on Stereoscopic 3D Video

3D Visual Attention Models (3D-VAM): 3D-VAMs attempt to predict the locations in 3D videos that are likely to attract viewers' attention.

 

DML-iTrack-3D: DML-iTrack-3D is an eye-tracking dataset of stereoscopic 3D videos, made available to the research community for non-commercial use. Contact ‘dehkordi [at] ece [dot] ubc [dot] ca’ if you intend to use the dataset for commercial purposes.

 

The DML-iTrack-3D dataset contains the eye-tracking data of 61 stereoscopic 3D sequences, collected from 24 subjects. 27 videos are marked as the "training" set and their corresponding fixation maps are available for download. The remaining 34 videos are marked as the "validation" set and are used for performance evaluation. The fixation maps of the validation set are not made publicly available, to ensure a fair comparison among the various visual attention models:

 

Snapshots of the video sequences: http://ece.ubc.ca/%7Edehkordi/images/snapshots.png

 

Training Sequences:
    Videos (YUV 4:2:0): 45.3 GB
    Depth Maps: 22.6 GB
    Fixation Maps (2D): 22.6 GB
    Fixation Maps (3D): 22.6 GB

Validation Sequences:
    Videos (YUV 4:2:0): 58.0 GB
    Depth Maps: 29.0 GB
    Fixation Maps (2D): 29.0 GB
    Fixation Maps (3D): 29.0 GB
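
The sequences, depth maps, and fixation maps are distributed as raw, planar YUV 4:2:0 files. As a minimal sketch of how a frame can be read (in Python with NumPy; the 1920x1080 resolution and the file name below are placeholders, not part of the dataset documentation):

    import numpy as np

    def read_yuv420_frame(f, width, height):
        """Read one planar YUV 4:2:0 (I420) frame from an open file object.

        Returns the Y, U, and V planes as uint8 arrays, or None at end of file.
        """
        y_size = width * height
        c_size = (width // 2) * (height // 2)  # chroma planes are subsampled 2x2
        raw = f.read(y_size + 2 * c_size)
        if len(raw) < y_size + 2 * c_size:
            return None
        buf = np.frombuffer(raw, dtype=np.uint8)
        y = buf[:y_size].reshape(height, width)
        u = buf[y_size:y_size + c_size].reshape(height // 2, width // 2)
        v = buf[y_size + c_size:].reshape(height // 2, width // 2)
        return y, u, v

    # Example: iterate over the luma plane of a (hypothetical) 1080p training video.
    with open("training/video01_1920x1080.yuv", "rb") as f:
        while (frame := read_yuv420_frame(f, 1920, 1080)) is not None:
            y, u, v = frame
            # ... feed y (and, optionally, the corresponding depth map) to a VAM ...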

 

Our Benchmark dataset: The performance of different 2D and 3D visual attention models is evaluated over the DML-iTrack-3D dataset using several saliency metrics; the evaluation results are reported below. To rank the VAMs by accuracy, the models are ranked according to each metric separately and then sorted by the average of their ranks (only sAUC, KLD, and NSS are used for ranking). Details regarding our dataset and the performance evaluations can be found in the following paper:

 

    * Amin Banitalebi-Dehkordi; Eleni Nasiopoulos; Mahsa T. Pourazad; Panos Nasiopoulos. Benchmark three-dimensional eye-tracking dataset for visual saliency prediction on stereoscopic three-dimensional video. J. Electron. Imaging. 25 (1), 013008 (January 14, 2016); doi: 10.1117/1.JEI.25.1.013008.
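
As a rough illustration of this ranking procedure, the sketch below computes NSS for one model (sAUC and KLD would be computed analogously), ranks a set of models under each metric separately, and sorts them by their average rank. The metric implementation and the example scores are simplified assumptions, not the exact code or numbers behind the benchmark.

    import numpy as np

    def nss(saliency_map, fixation_map):
        """Normalized Scanpath Saliency: mean normalized saliency at fixated pixels."""
        s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-12)
        return s[fixation_map > 0].mean()

    def rank_models(scores_per_metric, higher_is_better):
        """Rank models under each metric separately, then sort by average rank.

        scores_per_metric: dict metric -> {model_name: score}
        higher_is_better:  dict metric -> bool (True for sAUC/NSS, False for KLD)
        """
        ranks = {}
        for metric, scores in scores_per_metric.items():
            ordered = sorted(scores, key=scores.get, reverse=higher_is_better[metric])
            for rank, model in enumerate(ordered, start=1):
                ranks.setdefault(model, []).append(rank)
        return sorted((np.mean(r), m) for m, r in ranks.items())

    # Hypothetical scores for three models under the three ranking metrics.
    scores = {
        "sAUC": {"A": 0.71, "B": 0.66, "C": 0.69},
        "NSS":  {"A": 1.45, "B": 1.60, "C": 1.20},
        "KLD":  {"A": 0.90, "B": 0.75, "C": 1.10},  # lower KLD is better
    }
    better = {"sAUC": True, "NSS": True, "KLD": False}
    print(rank_models(scores, better))  # models sorted by average rank, best first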

 

Submit New VAMs: Send us a link to download the results of your saliency prediction algorithm on the validation part of the DML-iTrack-3D dataset. We will evaluate the performance of your VAM and add your model to the benchmark. Note that your saliency maps should be submitted as one YUV 4:2:0 video file per video. Contact us if you have any questions.
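
One plausible way to produce such a file is to store each grayscale saliency map in the luma plane and fill the chroma planes with a neutral value. The sketch below assumes exactly that convention; the resolution and output file name are placeholders.

    import numpy as np

    def write_saliency_yuv420(path, saliency_frames):
        """Write grayscale saliency maps as a planar YUV 4:2:0 (I420) video.

        saliency_frames: iterable of 2D float arrays in [0, 1], all the same size.
        The map goes into the luma plane; chroma planes are set to neutral (128).
        """
        with open(path, "wb") as f:
            for frame in saliency_frames:
                h, w = frame.shape
                y = np.clip(frame * 255.0, 0, 255).astype(np.uint8)
                uv = np.full((h // 2, w // 2), 128, dtype=np.uint8)  # neutral chroma
                f.write(y.tobytes())
                f.write(uv.tobytes())  # U plane
                f.write(uv.tobytes())  # V plane

    # Example: 300 dummy frames at a placeholder 1080p resolution.
    frames = (np.random.rand(1080, 1920) for _ in range(300))
    write_saliency_yuv420("my_vam_video01_1920x1080.yuv", frames)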

 

LBVS-3D: Learning-Based Visual Saliency prediction for stereoscopic 3D video: Our proposed saliency prediction method for stereoscopic 3D video. Details can be found in the following paper:

 

    * A. Banitalebi-Dehkordi, M.T. Pourazad, and P. Nasiopoulos, "A Learning-Based Visual Saliency prediction model for stereoscopic 3D video (LBVS-3D)," Multimedia Tools and Applications, 2016, DOI: 10.1007/s11042-016-4155-y.

 

    * Code: available at: http://ece.ubc.ca/~dehkordi/code.html#LBVS-3D

Performance evaluation of 2D VAMs when compared against the 2D and 3D eye-tracking fixation density maps (FDMs):

 

Performance improvements when depth maps are integrated into 2D VAMs for saliency prediction on 3D video:
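
As one hedged example of such an integration, a 2D saliency map can be re-weighted by the normalized depth map so that closer regions receive more saliency. This multiplicative fusion is only an illustrative scheme, not the specific combination method evaluated in the benchmark.

    import numpy as np

    def depth_weighted_saliency(saliency_2d, depth_map, alpha=0.5):
        """Blend a 2D saliency map with a depth map (both 2D arrays of equal size).

        Assumes larger depth-map values correspond to closer (more salient) regions;
        alpha controls the strength of the depth weighting.
        """
        def norm(x):
            x = x.astype(np.float64)
            rng = x.max() - x.min()
            return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

        s = norm(saliency_2d)
        d = norm(depth_map)
        fused = s * ((1.0 - alpha) + alpha * d)  # down-weight far regions
        return norm(fused)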

 

 

References:

 

[1] A. Banitalebi-Dehkordi, M.T. Pourazad, and P. Nasiopoulos, “LBVS-3D: Learning-Based Visual Saliency prediction for stereoscopic 3D video,” Multimedia Tools and Applications, 2016, DOI: 10.1007/s11042-016-4155-y.

[2] T. Judd, K. Ehinger, F. Durand, and A. Torralba, “Learning to predict where humans look,” ICCV 2009.

[3] E. Erdem and A. Erdem, “Visual saliency estimation by nonlinearly integrating features using region covariances,” Journal of Vision, 2013.

[4] L. Duan, Ch. Wu, J. Miao, L. Qing, and Y. Fu, “Visual saliency detection by spatially weighted dissimilarity,” CVPR 2011.

[5] N. Riche, M. Mancas, M. Duvinage, M. Mibulumukini, B. Gosselin, and T. Dutoit, “RARE2012: A multi-scale rarity-based saliency detection with its comparative statistical analysis,” Signal Processing: Image Communication, 2013.

[6] R. Margolin, L. Zelnik-Manor, and A. Tal. “What makes a patch distinct,” CVPR 2013.

[7] J. Zhang and S. Sclaroff, “Saliency detection: a boolean map approach,” ICCV 2013.

[8] B. Schauerte and R. Stiefelhagen, “Quaternion-based spectral saliency detection for eye fixation prediction,” ECCV 2012.

[9] Y. Fang, Z. Chen, W. Lin, C.-W. Lin, “Saliency detection in the compressed domain for adaptive image retargeting,” IEEE Transactions on Image Processing, vol. 21, no. 9, pp. 3888-3901, 2012.

[10] Y. Fang, J. Wang, M. Narwaria, P.L. Callet, and W. Lin, “Saliency detection for stereoscopic images,” IEEE Transactions on Image Processing, vol. 23, no. 6, pp. 2625-2636, Feb. 2014.

[11] J. Harel, C. Koch, and P. Perona, “Graph-Based Visual Saliency,” NIPS 2006.

[12] R. Margolin, L. Zelnik-Manor, and A. Tal, “Saliency for image manipulation,” CGI 2012.

[13] L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual attention for rapid scene analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, No. 11, pp. 1254-1259, Nov 1998.

[14] C. Kocberber and A.A. Salah, “Video retargeting: video saliency and optical flow based hybrid approach,” AAAI 2014.

[15] M.-M. Cheng, G.-X. Zhang, N.J. Mitra, X. Huang, and S.-M. Hu, “Global Contrast based Salient Region Detection,” IEEE CVPR, pp. 409-416, 2011.

[16] H.R. Tavakoli, E. Rahtu, and J. Heikkila, “Fast and efficient saliency detection using sparse sampling and kernel density estimation,” SCIA 2011.

[17] T. Yubing, F.A. Cheikh, F.F.E. Guraya, H. Konik, A. Tremeau, “A spatiotemporal saliency model for video surveillance,” Cognitive Computation, vol. 3, pp. 241-263, 2011.

[18] N. Bruce, J. Tsotsos, “Attention based on information maximization,” Journal of Vision, 2007.

[19] J. Kim, D. Han, Y-W. Tai, and J. Kim, “Salient region detection via high-dimensional color transform,” CVPR 2014.

[20] S. Goferman, L. Zelnik-Manor, and A. Tal, “Context-aware saliency detection,” CVPR 2010.

[21] Ch. Yang, L. Zhang, H. Lu, M-H. Yang, and X. Ruan, “Saliency detection via graph-based manifold ranking,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), Portland, June, 2013.

[22] Q. Yan, L. Xu, J. Shi, and J. Jia, “Hierarchical saliency detection,” CVPR 2013.

[23] H.J. Seo and P. Milanfar. “Static and space-time visual saliency detection by self-resemblance,” Journal of Vision, 2012.

[24] T. Judd, K. Ehinger, F. Durand, and A. Torralba, “Learning to predict where humans look,” ICCV 2009.

[25] T.N. Vikram, M. Tscherepanow, and B. Wrede, “A saliency map based on sampling an image into random rectangular regions of interest,” Pattern Recognition, 2012.

[26] L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual attention for rapid scene analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 11, pp. 1254-1259, Nov. 1998.

[27] L. Zhang, Zh. Gu, and H. Li, “SDSP: a novel saliency detection method by combining simple priors,” ICIP, 2013.

[28] S-h. Zhong, Y. Liu, F. Ren, J. Zhang, and T. Ren, “Video saliency detection via dynamic consistent spatio-temporal attention modelling,” 27th AAAI Conference on Artificial Intelligence, 2013.

[29] E. Rahtu, J. Kannala, M. Salo, and J. Heikkilä, “Segmenting salient objects from images and Videos,” European Conference on Computer Vision, ECCV 2010.

[30] N. Murray, M. Vanrell, X. Otazu, and C.A. Parraga, “Saliency estimation using a non-parametric vision model,” CVPR 2011.

[31] L. Coria, D. Xu, and P. Nasiopoulos, “Automatic stereoscopic 3D video reframing,” in Proc. of 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), 2012, ETH Zurich, Oct. 15-17, 2012, pp. 1-4.

[32] H.J. Seo and P. Milanfar. “Static and space-time visual saliency detection by self-resemblance,” Journal of Vision, 2012.

[33] Y. Xie, H. Lu, and M-H. Yang, “Bayesian saliency via low and mid level cues,” IEEE Transactions on Image Processing (TIP), vol. 22, no. 5, pp. 1689-1698, 2013.

[34] X. Hou and L. Zhang, “Saliency detection: a spectral residual approach,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2007, pp. 1-8.

[35] R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk, “Frequency-tuned salient region detection,” in IEEE CVPR, 2009, pp. 1597-1604.

[36] J. Wang, M.P.D. Silva, P.L. Callet, and V. Ricordel, “Computational model of stereoscopic 3D visual saliency,” IEEE Transactions on Image Processing, vol. 22, no. 6, June 2013.

[37] Y. Zhai and M. Shah, “Visual attention detection in video sequences using spatiotemporal cues,” in ACM Multimedia, 2006, pp. 815-824.

[38] A. Torralba, A. Oliva, M.S. Castelhano, and J.M. Henderson, “Contextual guidance of attention in natural scenes: The role of global features on object search,” Psychological Review, 2006.

[39] N. Ouerhani and H. Hugli, “Computing visual attention from scene depth,” in Proc. IEEE 15th Int. Conf. Pattern Recognition., vol. 1, Sep. 2000, pp. 375-378.

[40] X. Fan, Z. Liu, and G. Sun, “Salient region detection for stereoscopic images,” 19th International Conference on Digital Signal Processing, 2014, Hong Kong.

[41] M.-M. Cheng, G.-X. Zhang, N.J. Mitra, X. Huang, and S.-M. Hu, “Global Contrast based Salient Region Detection,” IEEE CVPR, pp. 409-416, 2011.

[42] Y. Niu, Y. Geng, X. Li, and F. Liu, “Leveraging stereopsis for saliency analysis,” CVPR 2012.

[43] Y. Fang, W. Lin, B-S. Lee, C.T. Lau, Zh. Chen, Ch-W. Lin, "Bottom-up saliency detection model based on human visual sensitivity and amplitude spectrum," IEEE Transactions on Multimedia 14(1): 187-198 (2012).

[44] Y. Zhai and M. Shah, “Visual attention detection in video sequences using spatiotemporal cues,” in ACM Multimedia, 2006, pp. 815-824.

[45] Y-F. Ma, H-J. Zhang, “A model of motion attention for video skimming,” IEEE ICIP 2002.

[46] H.J. Seo and P. Milanfar. “Static and space-time visual saliency detection by self-resemblance,” Journal of Vision, 2012.

[47] Q. Jiang, F. Duan, and F. Shao, “3D Visual attention for stereoscopic image quality assessment,” Journal of Software, vol. 9, no. 7, July 2014.

[48] R. Ju, L. Ge, W. Geng, T. Ren, and G. Wu, “Depth saliency based on anisotropic center-surround difference,” ICIP 2014.

[49] C. Chamaret, S. Godeffroy, P. Lopez, and O. L. Meur, “Adaptive 3D rendering based on region-of-interest,” in Proc. SPIE, Feb. 2010.

[50] Y. Zhang, G. Jiang, M. Yu, and K. Chen, “Stereoscopic visual attention model for 3D video,” Advances in Multimedia Modeling, New York, USA: Springer-Verlag, 2010, pp.314-324.