Neighbours Matter: Image Captioning with Similar Images

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

2 Scopus Citations

Author(s)

  • Qingzhong Wang
  • Jiuniu Wang
  • Antoni B. Chan
  • Siyu Huang
  • Haoyi Xiong
  • Xingjian Li
  • Dejing Dou

Detail(s)

Original language: English
Title of host publication: 31st British Machine Vision Conference, BMVC 2020
Publisher: British Machine Vision Association, BMVA
Number of pages: 14
Publication status: Published - Sept 2020

Publication series

Name: British Machine Vision Conference, BMVC

Conference

Title: 31st British Machine Vision Conference (BMVC 2020)
Location: Virtual
Period: 7 - 10 September 2020

Abstract

Most image captioning models generate captions based solely on the input image. However, images that are similar to the input image contain variations of the same or similar concepts. Aggregating information over similar images can therefore improve image captioning models by strengthening or inferring concepts that are present in the input image. In this paper, we propose an image captioning model based on KNN graphs composed of the input image and its similar images, where each node denotes an image or a caption. An attention-in-attention (AiA) model is developed to refine the node representations. Using the refined features significantly improves baseline performance; e.g., the CIDEr score obtained by the Up-Down model increases from 120.1 to 125.6. Our proposed method achieves a CIDEr score of 129.3 and a SPICE score of 22.6 on Karpathy's test split, which is competitive with state-of-the-art models that employ fine-grained image features such as scene graphs and image parsing trees. © 2020. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.
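
The general idea in the abstract, retrieving an image's nearest neighbours and pooling their features with attention, can be illustrated with a minimal sketch. This is not the paper's attention-in-attention (AiA) model: it substitutes plain cosine-similarity retrieval and a single scaled dot-product attention step, and the function names (`k_nearest`, `attend`, `refine`) and the toy feature vectors are assumptions made purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def k_nearest(query, gallery, k):
    """Indices of the k gallery vectors most similar to the query (hypothetical helper)."""
    ranked = sorted(range(len(gallery)),
                    key=lambda i: cosine(query, gallery[i]),
                    reverse=True)
    return ranked[:k]


def attend(query, nodes):
    """Scaled dot-product attention of the query over a list of node vectors.

    Returns the softmax weights and the weighted average of the nodes.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * n for q, n in zip(query, node)) / scale for node in nodes]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * node[d] for w, node in zip(weights, nodes))
              for d in range(len(query))]
    return weights, pooled


def refine(query, gallery, k=2):
    """Refine the query feature using itself plus its k nearest neighbours."""
    neighbours = [gallery[i] for i in k_nearest(query, gallery, k)]
    nodes = [query] + neighbours  # the input image is also a node in the graph
    weights, pooled = attend(query, nodes)
    return pooled, weights


# Toy 2-D "image features" (assumed values, not from the paper).
gallery = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.0]
refined, weights = refine(query, gallery, k=2)
```

In this sketch the refined feature is a convex combination of the query and its neighbours, so neighbours that agree with the input reinforce its dominant dimensions. The paper's graph also contains caption nodes and uses a learned attention model, neither of which is represented here.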

Citation Format(s)

Neighbours Matter: Image Captioning with Similar Images. / Wang, Qingzhong; Wang, Jiuniu; Chan, Antoni B. et al.
31st British Machine Vision Conference, BMVC 2020. British Machine Vision Association, BMVA, 2020. (British Machine Vision Conference, BMVC).