Real-Time Video Saliency Prediction Via 3D Residual Convolutional Neural Network

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review

Author(s)

SUN, Zhenhao; WANG, Xu; ZHANG, Qiudan; JIANG, Jianmin

Detail(s)

Original language: English
Article number: 8863376
Pages (from-to): 147743-147754
Journal / Publication: IEEE Access
Volume: 7
Online published: 9 Oct 2019
Publication status: Published - 2019

Abstract

Attention is a fundamental attribute of the human visual system and plays an important role in many visual perception tasks. The key issue in video saliency prediction lies in how to efficiently exploit temporal information. Instead of computing the temporal saliency maps separately, we propose a real-time, end-to-end video saliency prediction model based on a 3D residual convolutional neural network (3D-ResNet), which incorporates the prediction of spatial and temporal saliency maps into a single process. In particular, a multi-scale feature representation scheme is employed to further boost model performance. In addition, a frame-skipping strategy is proposed to speed up saliency map inference. Moreover, a new and challenging eye-tracking database with 220 video clips is established to facilitate research on video saliency prediction. Extensive experimental results show that our model outperforms state-of-the-art methods on the eye fixation datasets in terms of both prediction accuracy and inference speed.
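
Illustrative Code Sketch

The abstract points to three concrete ingredients: a 3D residual backbone that processes a short clip so spatial and temporal saliency are predicted in one pass, a multi-scale feature representation, and a frame-skipping strategy that speeds up inference. The minimal PyTorch sketch below is not the authors' released implementation; it only illustrates, under assumed shapes and with hypothetical names (Residual3DBlock, TinySaliencyNet, predict_with_frame_skipping), how a basic 3D residual block and a simple frame-skipping inference loop could be structured. The multi-scale feature scheme is omitted for brevity.

import torch
import torch.nn as nn


class Residual3DBlock(nn.Module):
    """Basic 3D residual block: two 3x3x3 convolutions with an identity shortcut."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm3d(channels)
        self.conv2 = nn.Conv3d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm3d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, frames, height, width); convolving over the
        # frame axis lets the block learn spatial and temporal cues jointly.
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)


class TinySaliencyNet(nn.Module):
    """Toy stand-in for a full 3D-ResNet: stem, one residual block, 1-channel head."""

    def __init__(self, channels: int = 16):
        super().__init__()
        self.stem = nn.Conv3d(3, channels, kernel_size=3, padding=1)
        self.block = Residual3DBlock(channels)
        self.head = nn.Conv3d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.block(self.stem(x)))


def predict_with_frame_skipping(model: nn.Module, clip: torch.Tensor, stride: int = 2) -> torch.Tensor:
    """Run the model on every `stride`-th frame and reuse the latest saliency map for
    the skipped frames (one plausible reading of a frame-skipping strategy)."""
    model.eval()
    b, _, t, h, w = clip.shape
    saliency = torch.zeros(b, 1, t, h, w)
    last = None
    with torch.no_grad():
        for i in range(t):
            if i % stride == 0 or last is None:
                # Single-frame window for brevity; a real model would consume a short
                # temporal window around frame i.
                last = torch.sigmoid(model(clip[:, :, i:i + 1]))
            saliency[:, :, i:i + 1] = last
    return saliency


if __name__ == "__main__":
    net = TinySaliencyNet()
    clip = torch.rand(1, 3, 8, 64, 64)  # (batch, RGB, frames, height, width)
    maps = predict_with_frame_skipping(net, clip, stride=2)
    print(maps.shape)  # torch.Size([1, 1, 8, 64, 64])

Reusing the previous saliency map for skipped frames trades a small amount of temporal accuracy for roughly a stride-fold reduction in forward passes, which matches the real-time motivation stated in the abstract.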

Research Area(s)

  • 3D residual convolutional neural network, eye fixation dataset, video saliency prediction

Citation Format(s)

Real-Time Video Saliency Prediction Via 3D Residual Convolutional Neural Network. / SUN, Zhenhao; WANG, Xu; ZHANG, Qiudan; JIANG, Jianmin.

In: IEEE Access, Vol. 7, 8863376, 2019, p. 147743-147754.
