Inferring Attention Shifts for Salient Instance Ranking

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Original language: English
Pages (from-to): 964–986
Number of pages: 23
Journal / Publication: International Journal of Computer Vision
Issue number: 3
Online published: 18 Oct 2023
Publication status: Published - Mar 2024



The human visual system has limited capacity for processing multiple visual inputs simultaneously. Consequently, humans rely on shifting their attention from one location to another. When viewing an image of a complex scene, psychology studies and behavioural observations show that humans prioritise and sequentially shift attention among multiple visual stimuli. In this paper, we propose to predict the saliency rank of multiple objects by inferring human attention shift. We first construct a new large-scale salient object ranking dataset, with the saliency rank of objects defined by the order in which an observer attends to these objects via attention shift. We then propose a new deep learning-based model to leverage both bottom-up and top-down attention mechanisms for saliency rank prediction. Our model includes three novel modules: Spatial Mask Module (SMM), Selective Attention Module (SAM) and Salient Instance Edge Module (SIEM). SMM integrates bottom-up and semantic object properties to enhance contextual object features, from which SAM learns the dependencies between object features and image features for saliency reasoning. SIEM is designed to improve segmentation of salient objects, which helps further improve their rank predictions. Experimental results show that our proposed network achieves state-of-the-art performance on the salient object ranking task across multiple datasets. Code and data are available at © 2023, The Author(s).
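As a rough illustration of the dataset definition in the abstract (saliency rank given by the order in which an observer attends to objects), the sketch below ranks objects by their first fixation. It is only one simple reading of that definition, not the authors' annotation procedure: the function name `rank_by_first_fixation`, the input format (a sequence of object ids, with `None` for fixations on background), and the choice to ignore revisits are all assumptions made for this example.

```python
from typing import Dict, Hashable, Optional, Sequence


def rank_by_first_fixation(
    fixated_objects: Sequence[Optional[Hashable]],
) -> Dict[Hashable, int]:
    """Map each object id to a rank, where 1 means attended first.

    Assumption: the input lists, in temporal order, which object each
    fixation landed on (None = background). Revisits do not change a rank.
    """
    ranks: Dict[Hashable, int] = {}
    for obj in fixated_objects:
        # Skip background fixations and objects that already have a rank.
        if obj is not None and obj not in ranks:
            ranks[obj] = len(ranks) + 1
    return ranks


# Hypothetical observer: person, background, dog, person again, ball.
print(rank_by_first_fixation(["person", None, "dog", "person", "ball"]))
# prints {'person': 1, 'dog': 2, 'ball': 3}
```

A real dataset would also need to combine the orders from several observers; that aggregation step is not described in the abstract and is left out here.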

Research Area(s)

  • Attention shift, Saliency, Saliency ranking, Salient object detection

Citation Format(s)

Inferring Attention Shifts for Salient Instance Ranking. / Siris, Avishek; Jiao, Jianbo; Tam, Gary K.L. et al.
In: International Journal of Computer Vision, Vol. 132, No. 3, 03.2024, p. 964–986.

