TY - CONF
T1 - How much time do you have? Modeling multi-duration saliency
AU - Fosco, Camilo
AU - Newman, Anelise
AU - Sukhum, Pat
AU - Zhang, Yun Bin
AU - Zhao, Nanxuan
AU - Oliva, Aude
AU - Bylinskii, Zoya
PY - 2020/6
AB - What jumps out in a single glance of an image is different than what you might notice after closer inspection. Yet conventional models of visual saliency produce predictions at an arbitrary, fixed viewing duration, offering a limited view of the rich interactions between image content and gaze location. In this paper we propose to capture gaze as a series of snapshots, by generating population-level saliency heatmaps for multiple viewing durations. We collect the CodeCharts1K dataset, which contains multiple distinct heatmaps per image corresponding to 0.5, 3, and 5 seconds of free-viewing. We develop an LSTM-based model of saliency that simultaneously trains on data from multiple viewing durations. Our Multi-Duration Saliency Excited Model (MD-SEM) achieves competitive performance on the LSUN 2017 Challenge with 57% fewer parameters than comparable architectures. It is the first model that produces heatmaps at multiple viewing durations, enabling applications where multi-duration saliency can be used to prioritize visual content to keep, transmit, and render.
UR - http://www.scopus.com/inward/record.url?scp=85094810301&partnerID=8YFLogxK
DO - 10.1109/CVPR42600.2020.00453
M3 - RGC 32 - Refereed conference paper (with host publication)
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 4472
EP - 4481
BT - Proceedings - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020)
PB - IEEE
T2 - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020
Y2 - 14 June 2020 through 19 June 2020
ER -