Hybrid Deep Neural Network-Based Cross-Modal Image and Text Retrieval Method for Large-Scale Data

Baohua Qiang, Ruidong Chen, Yuan Xie, Mingliang Zhou*, Riwei Pan, Tian Zhao

*Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

5 Citations (Scopus)

Abstract

In this paper, we propose a hybrid deep neural network-based cross-modal image and text retrieval method that explores complex cross-modal correlations through multi-layer learning. First, we combine intra-modal and inter-modal representations to obtain a complementary single-modal representation that preserves the correlation between modalities. Second, we build associations between the modalities through hierarchical learning to further mine fine-grained latent semantic associations among multimodal data. Experimental results show that our algorithm substantially improves retrieval performance and consistently outperforms four comparison methods.
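To make the abstract's two-step idea concrete, the following is a minimal PyTorch sketch of one plausible reading, not the authors' published architecture: each modality gets its own encoder whose intermediate (intra-modal) representation and shared-space (inter-modal) projection are both aligned across modalities, giving a hierarchical, multi-layer objective. The layer sizes, the contrastive loss, and the 0.5 weight are all illustrative assumptions.

# Illustrative sketch only: encoders, loss, and weights are assumptions
# for exposition, not the paper's actual configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalEncoder(nn.Module):
    """Encodes one modality, exposing an intermediate (intra-modal)
    representation and a projection into the shared cross-modal space."""
    def __init__(self, in_dim, hid_dim=1024, out_dim=256):
        super().__init__()
        self.intra = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
        self.shared = nn.Linear(hid_dim, out_dim)

    def forward(self, x):
        h = self.intra(x)    # intra-modal representation
        z = self.shared(h)   # inter-modal (shared-space) representation
        return h, z

def align_loss(a, b, temperature=0.07):
    """Symmetric contrastive loss pulling matched image/text pairs together."""
    a, b = F.normalize(a, dim=1), F.normalize(b, dim=1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0))
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

# Toy usage: 4 paired samples, 2048-d image features, 300-d text features.
img_enc, txt_enc = ModalEncoder(2048), ModalEncoder(300)
h_i, z_i = img_enc(torch.randn(4, 2048))
h_t, z_t = txt_enc(torch.randn(4, 300))
# Hierarchical objective: align both the intermediate and final layers.
loss = align_loss(z_i, z_t) + 0.5 * align_loss(h_i, h_t)
loss.backward()

Under this reading, retrieval at test time reduces to ranking by cosine similarity between the shared-space embeddings z of a query in one modality and the candidates in the other.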
Original language: English
Article number: 2150018
Journal: Journal of Circuits, Systems and Computers
Volume: 30
Issue number: 1
Online published: 5 Aug 2020
DOIs
Publication status: Published - Jan 2021

Research Keywords

  • Cross-modal
  • hybrid deep neural network
  • image and text retrieval
  • large-scale data
