SeqRank: Sequential Ranking of Salient Objects

Huankang Guan, Rynson W.H. Lau*

*Corresponding author for this work

Research output: RGC 32 - Refereed conference paper (with host publication), peer-reviewed

4 Citations (Scopus)

Abstract

Salient Object Ranking (SOR) is the task of predicting the order in which an observer attends to objects when viewing a complex scene. Existing SOR methods primarily rank all scene objects simultaneously by exploring their spatial and semantic properties. However, ranking all salient objects at once does not align with human viewing behavior and may result in incorrect attention-shift predictions. We observe that humans view a scene through a sequential and continuous process: a cycle of foveating to objects of interest with foveal vision while using peripheral vision to prepare the next fixation location. For instance, when we see a flying kite, our foveal vision captures the kite itself, while our peripheral vision helps us locate the person controlling it, so that we can smoothly shift our attention there next. By repeating this cycle, we gain a thorough understanding of the entire scene. Based on this observation, we propose to model the dynamic interplay between foveal and peripheral vision to predict human attention shifts sequentially. To this end, we propose a novel SOR model, SeqRank, which reproduces foveal vision to extract high-acuity visual features for accurate salient instance segmentation, while also modeling peripheral vision to select the object that is likely to grab the viewer's attention next. By incorporating both types of vision, our model mimics human viewing behavior more closely and provides a more faithful ranking of scene objects. Most notably, our model improves the SA-SOR/MAE scores by +6.1%/-13.0% on IRSR, compared with the state of the art. Extensive experiments show the superior performance of our model on the SOR benchmarks. Code is available at https://github.com/guanhuankang/SeqRank.
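The foveate-then-peripherally-select cycle described above can be illustrated with a toy greedy loop (this is not the authors' architecture; `saliency` and `affinity` are hypothetical stand-ins for learned foveal saliency scores and peripheral object-relatedness):

```python
import numpy as np

def sequential_rank(saliency, affinity):
    """Toy sketch of sequential attention-shift prediction.

    saliency: per-object salience scores (stand-in for foveal features).
    affinity: pairwise relatedness matrix (stand-in for peripheral cues).
    Fixate on the most salient object first, then repeatedly shift to the
    unvisited object whose peripheral score (affinity to the current
    fixation, weighted by its own salience) is highest.
    """
    saliency = np.asarray(saliency, dtype=float)
    affinity = np.asarray(affinity, dtype=float)
    n = len(saliency)
    order = [int(np.argmax(saliency))]  # first fixation: most salient object
    visited = set(order)
    while len(order) < n:
        cur = order[-1]
        # Peripheral step: score candidates relative to the current fixation.
        scores = [affinity[cur, j] * saliency[j] if j not in visited else -np.inf
                  for j in range(n)]
        nxt = int(np.argmax(scores))
        order.append(nxt)
        visited.add(nxt)
    return order

# Example: three objects; the kite (index 1) is most salient, and the
# person flying it (index 2) is strongly related to it.
print(sequential_rank([0.2, 0.9, 0.5],
                      [[1.0, 0.1, 0.8],
                       [0.1, 1.0, 0.7],
                       [0.8, 0.7, 1.0]]))  # -> [1, 2, 0]
```

The real model replaces both hand-set arrays with learned features and predicts salient instance masks jointly, but the greedy "fixate, then select next" structure is the behavior the paper models.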

© 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Original language: English
Title of host publication: Proceedings of the 38th AAAI Conference on Artificial Intelligence
Editors: Jennifer Dy, Sriraam Natarajan, Michael Wooldridge
Publisher: AAAI Press
Pages: 1941-1949
Number of pages: 9
ISBN (Print): 1-57735-887-2, 978-1-57735-887-9
DOIs
Publication status: Published - 2024
Event: 38th Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence (AAAI-24), Vancouver Convention Center, Vancouver, Canada
Duration: 20 Feb 2024 - 27 Feb 2024
https://aaai.org/aaai-conference/
https://ojs.aaai.org/index.php/AAAI/issue/archive

Publication series

Name: Proceedings of the AAAI Conference on Artificial Intelligence
Number: 3
Volume: 38
ISSN (Print): 2159-5399
ISSN (Electronic): 2374-3468

Conference

Conference: 38th Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence (AAAI-24)
Country/Territory: Canada
City: Vancouver
Period: 20/02/24 - 27/02/24

Bibliographical note

Information for this record is supplemented by the author(s) concerned.

Research Keywords

  • Low-level vision
  • Salient object detection (SOD)
  • Salient object ranking (SOR)
