Fusing heterogeneous modalities for video and image re-ranking

Hung-Khoon Tan, Chong-Wah Ngo

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

14 Citations (Scopus)

Abstract

Multimedia documents on popular image and video sharing websites such as Flickr and YouTube are heterogeneous documents with diverse representations and rich user-supplied information. In this paper, we investigate how the agreement among heterogeneous modalities can be exploited to guide data fusion. The problem of fusion is cast as the simultaneous mining of agreement from different modalities and the adaptation of fusion weights to construct a fused graph from these modalities. An iterative framework based on agreement-fusion optimization is thus proposed. We plug two well-known algorithms, random walk and semi-supervised learning, into this framework to illustrate how agreement (conflict) is incorporated (compromised) under both uniform and adaptive fusion. Experimental results on web video and image re-ranking demonstrate that, with a proper fusion strategy rather than simple linear fusion, performance improvement on search can generally be expected. © 2011 ACM.
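The abstract's idea of alternating between fusing modality graphs and adapting their weights by agreement can be illustrated with a minimal sketch. The sketch below is an assumption-laden illustration, not the paper's actual algorithm: it instantiates the "random walk" variant with a hypothetical agreement measure (how strongly each modality's graph concentrates mass on the currently top-ranked documents) and alternates that with weight re-normalization. All function names, the restart parameter `alpha`, and the agreement formula are illustrative choices.

```python
import numpy as np

def row_normalize(W):
    # Turn a non-negative affinity matrix into a row-stochastic transition matrix.
    s = W.sum(axis=1, keepdims=True)
    s[s == 0] = 1.0
    return W / s

def random_walk(P, prior, alpha=0.8, iters=50):
    # Random walk with restart: repeatedly diffuse scores over the fused
    # graph P while pulling back toward the initial (e.g. text-based) prior.
    r = prior.copy()
    for _ in range(iters):
        r = alpha * (P.T @ r) + (1 - alpha) * prior
    return r

def agreement_fusion_rerank(graphs, prior, rounds=10):
    # graphs: list of per-modality (n x n) affinity matrices
    # prior:  initial relevance scores for the n documents (sums to 1)
    k = len(graphs)
    w = np.full(k, 1.0 / k)                 # start from uniform fusion weights
    Ps = [row_normalize(W) for W in graphs]
    for _ in range(rounds):
        # 1) Fuse the modality graphs with the current weights and re-rank
        #    by running a random walk over the fused graph.
        P = sum(wi * Pi for wi, Pi in zip(w, Ps))
        r = random_walk(P, prior)
        # 2) Score each modality by how much its graph "agrees" with the
        #    fused ranking (a quadratic smoothness term r' P_i' r, assumed
        #    here for illustration), then adapt the fusion weights.
        agree = np.array([float(r @ (Pi.T @ r)) for Pi in Ps])
        w = agree / agree.sum()
    return r, w
```

Under this sketch, a modality whose neighborhood structure conflicts with the fused ranking receives a lower weight in the next round, which is one simple way agreement can be "incorporated" and conflict "compromised" adaptively rather than via fixed linear fusion.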
Original language: English
Title of host publication: Proceedings of the 1st ACM International Conference on Multimedia Retrieval, ICMR'11
DOIs
Publication status: Published - 2011
Event: 1st ACM International Conference on Multimedia Retrieval, ICMR'11 - Trento, Italy
Duration: 18 Apr 2011 → 20 Apr 2011

Conference

Conference: 1st ACM International Conference on Multimedia Retrieval, ICMR'11
Place: Italy
City: Trento
Period: 18/04/11 → 20/04/11

Research Keywords

  • graph fusion
  • heterogeneous modality fusion
  • modality agreement
  • re-ranking
