Interpretable Embedding for Ad-Hoc Video Search

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

25 Scopus Citations

Detail(s)

Original language: English
Title of host publication: MM '20 - Proceedings of the 28th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 3357-3366
ISBN (electronic): 9781450379885
Publication status: Published - Oct 2020

Publication series

Name: MM - Proceedings of the ACM International Conference on Multimedia

Conference

Title: 28th ACM International Conference on Multimedia (MM 2020)
Location: Virtual
Place: United States
City: Seattle
Period: 12 - 16 October 2020

Abstract

Answering queries with semantic concepts has long been the mainstream approach to video search. Only recently has its performance been surpassed by the concept-free approach, which embeds queries in the same joint space as videos. Nevertheless, the embedded features and the resulting search results are not interpretable, which hinders subsequent steps in video browsing and query reformulation. This paper integrates feature embedding and concept interpretation into a neural network for unified dual-task learning. In this way, an embedding is associated with a list of semantic concepts that serves as an interpretation of the video content. The paper empirically demonstrates that considerable search improvement is attainable on TRECVid benchmark datasets using either the embedding features or the concepts. Concepts are not only effective in pruning false-positive videos but also highly complementary to concept-free search, leading to a large margin of improvement over state-of-the-art approaches.
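
The abstract does not give the scoring details, so the following is only a minimal sketch of how a concept-based score might prune and complement a concept-free (embedding) score at search time. The function names, the linear fusion with `weight`, the mean-probability concept score, and the `prune_below` threshold are all illustrative assumptions, not the method published in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def concept_score(query_concepts, video_concepts):
    """Mean predicted probability of the query concepts for one video.

    video_concepts maps concept name -> probability; missing concepts count as 0.
    (Hypothetical scoring rule, chosen for simplicity.)
    """
    if not query_concepts:
        return 0.0
    total = sum(video_concepts.get(c, 0.0) for c in query_concepts)
    return total / len(query_concepts)

def search(query_vec, query_concepts, videos, weight=0.5, prune_below=0.0):
    """Rank video ids by a weighted sum of embedding and concept scores.

    videos maps video id -> (embedding vector, concept probabilities).
    Videos whose concept score is below prune_below are dropped first,
    mimicking the idea of using concepts to prune false positives.
    """
    ranked = []
    for vid, (vec, concepts) in videos.items():
        c = concept_score(query_concepts, concepts)
        if c < prune_below:
            continue  # assumed pruning rule: concept evidence too weak
        e = cosine(query_vec, vec)
        ranked.append((weight * e + (1 - weight) * c, vid))
    ranked.sort(reverse=True)
    return [vid for _, vid in ranked]
```

With pruning enabled, a video that sits close to the query in the embedding space but shows little evidence of the queried concepts is removed before ranking; with `prune_below=0.0`, the concept score only adjusts the order.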

Research Area(s)

  • ad-hoc video search, concept-based search, concept-free search, interpretable video search

Bibliographic Note

Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).

Citation Format(s)

Interpretable Embedding for Ad-Hoc Video Search. / Wu, Jiaxin; Ngo, Chong-Wah.
MM '20 - Proceedings of the 28th ACM International Conference on Multimedia. Association for Computing Machinery, Inc, 2020. p. 3357-3366 (MM - Proceedings of the ACM International Conference on Multimedia).