Neighbourhood Structure Preserving Cross-Modal Embedding for Video Hyperlinking
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Article number | 8736841 |
Pages (from-to) | 188-200 |
Journal / Publication | IEEE Transactions on Multimedia |
Volume | 22 |
Issue number | 1 |
Online published | 14 Jun 2019 |
Publication status | Published - Jan 2020 |
Abstract
Video hyperlinking is a task that aims to enhance the accessibility of large archives by establishing links between fragments of videos. The links model the aboutness between fragments for efficient traversal of video content. This paper addresses the problem of link construction from the perspective of cross-modal embedding. To this end, a generalized multi-modal auto-encoder is proposed. The encoder learns two embeddings, one from the visual modality and one from the speech modality, and each embedding performs both self-modal and cross-modal translation. Furthermore, to preserve the neighbourhood structure of fragments, which is important for video hyperlinking, the auto-encoder is devised to model the data distribution of fragments in a dataset. Experiments are conducted on the Blip10000 dataset using the anchor fragments provided by the TRECVid Video Hyperlinking (LNK) task in 2016 and 2017. This paper shares empirical insights on a number of issues in cross-modal learning for video hyperlinking, including the preservation of neighbourhood structure in the embedding, model fine-tuning and the issue of missing modalities.
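The self-modal and cross-modal translation described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the class name `TwoBranchAutoEncoder`, the single bias-free linear layers, the decoders shared by both embeddings, the mean-squared-error loss and all dimensions are assumptions made for illustration, and the neighbourhood-preserving term is omitted because the abstract does not specify it.

```python
import random


def init_weights(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Bias-free dense layer: returns the matrix-vector product weights @ x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


class TwoBranchAutoEncoder:
    """Hypothetical two-branch auto-encoder (assumed architecture).

    One encoder per modality produces an embedding; each embedding is
    decoded into both modalities (self-modal and cross-modal translation).
    """

    def __init__(self, dim_visual, dim_speech, dim_embed, seed=0):
        rng = random.Random(seed)
        self.enc = {
            "visual": init_weights(dim_embed, dim_visual, rng),
            "speech": init_weights(dim_embed, dim_speech, rng),
        }
        # Assumption: both embeddings share one decoder per target modality.
        self.dec = {
            "visual": init_weights(dim_visual, dim_embed, rng),
            "speech": init_weights(dim_speech, dim_embed, rng),
        }

    def embed(self, modality, x):
        """Embedding of feature vector x from the given modality."""
        return linear(self.enc[modality], x)

    def translation_losses(self, visual, speech):
        """Reconstruction error for every (source, target) modality pair."""
        inputs = {"visual": visual, "speech": speech}
        losses = {}
        for src, x in inputs.items():
            z = self.embed(src, x)
            for tgt, target in inputs.items():
                losses[(src, tgt)] = mse(linear(self.dec[tgt], z), target)
        return losses


model = TwoBranchAutoEncoder(dim_visual=4, dim_speech=3, dim_embed=2)
losses = model.translation_losses([1.0, 0.0, 0.5, 0.2], [0.3, 0.1, 0.9])
```

Here `losses` holds four terms: two self-modal (visual to visual, speech to speech) and two cross-modal (visual to speech, speech to visual). Under these assumptions, a training loop would minimise their sum.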
Research Area(s)
- cross-modal translation, structure-preserving learning, video hyperlinking
Citation Format(s)
Neighbourhood Structure Preserving Cross-Modal Embedding for Video Hyperlinking. / Hao, Yanbin; Ngo, Chong-Wah; Huet, Benoit.
In: IEEE Transactions on Multimedia, Vol. 22, No. 1, 8736841, 01.2020, p. 188-200.