Dual-perspective hypergraph learning network for multimodal entity and relation extraction

Jie Liu, Hong Zhong, Mingying Xu*, Baowen Wu, Linqi Song, Yinqiao LI, Lei Shi, Feifei Kou

*Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

Abstract

Multimodal Named Entity Recognition (MNER) and Multimodal Relation Extraction (MRE) identify entities and their semantic relationships within paired image-text data. Graph-based approaches have recently gained significant attention by constructing cross-modal graphs to achieve fine-grained alignment and interaction, demonstrating promising performance on MNER and MRE tasks. However, these approaches rely primarily on pairwise interactions, limiting their ability to model complex global dependencies and leading to semantic alignment bias. To address this, we propose a Dual-perspective Hypergraph Learning Network for Multimodal Entity and Relation Extraction (DHGLN) that captures high-order correlations among multiple nodes from a semantic perspective and a contextual-structure perspective. DHGLN adopts an attention mechanism and spectral graph convolution to learn semantic-level and contextual-structure-level hyperedge features that optimize node representations, achieving competitive and robust performance on both MNER and MRE tasks. Experimental results demonstrate significant improvements, with a +6.67% F1-score gain over the state-of-the-art baseline on the Twitter-2015 dataset for MNER. The model also performs strongly on the Twitter-2017 dataset for MNER and the MNRE dataset for MRE, highlighting the effectiveness and robustness of our approach. © 2025 Published by Elsevier Ltd.
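The abstract mentions spectral graph convolution over hyperedges to refine node representations. As background, a minimal sketch of a standard spectral hypergraph convolution layer (HGNN-style, X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Θ) is shown below; this illustrates the general technique only, not the authors' DHGLN implementation, and all names and shapes are illustrative assumptions.

```python
import numpy as np

def hypergraph_conv(X, H, Theta, edge_weights=None):
    """One generic spectral hypergraph convolution layer:
        X' = ReLU( Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Theta )
    X: (n_nodes, d_in) node features
    H: (n_nodes, n_edges) incidence matrix, H[v, e] = 1 if node v lies in hyperedge e
    Theta: (d_in, d_out) learnable projection (here just a fixed matrix for illustration)
    """
    n_nodes, n_edges = H.shape
    w = np.ones(n_edges) if edge_weights is None else edge_weights
    Dv = H @ w                          # weighted node degrees
    De = H.sum(axis=0)                  # hyperedge degrees (nodes per hyperedge)
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(Dv, 1e-12)))
    De_inv = np.diag(1.0 / np.maximum(De, 1e-12))
    W = np.diag(w)
    # Normalized hypergraph "adjacency": propagates node features through hyperedges
    A = Dv_inv_sqrt @ H @ W @ De_inv @ H.T @ Dv_inv_sqrt
    return np.maximum(A @ X @ Theta, 0.0)  # ReLU activation

# Toy example: 4 nodes grouped by 2 overlapping hyperedges
H = np.array([[1, 0],
              [1, 1],
              [1, 1],
              [0, 1]], dtype=float)
X = np.eye(4)                                  # one-hot node features
rng = np.random.default_rng(0)
Theta = rng.standard_normal((4, 3))
out = hypergraph_conv(X, H, Theta)
print(out.shape)  # (4, 3): each node aggregates features from all hyperedges it shares
```

Unlike a pairwise graph convolution, each hyperedge here connects an arbitrary set of nodes, so a single propagation step mixes information among all members of a group — the "high-order correlations" the abstract refers to.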
Original language: English
Article number: 130290
Journal: Expert Systems with Applications
Volume: 300
Online published: 8 Nov 2025
DOIs
Publication status: Published - 5 Mar 2026

Funding

This work is supported by the Joint Fund Key Program of the National Natural Science Foundation of China (U23B2029), the National Natural Science Foundation of China (62076167), the Yuxiu Innovation Project of NCUT (2024NCUTYXCX102), the North China University of Technology 2025 Youth Research Special Project (2025NCUTYRSP012), the North China University of Technology Research Start-up Fund Project (11005136024XN147-22), the Key Laboratory of Public Opinion Governance and Computational Communication (YQKFYB202501), the Ministry of Education (EIN2024C006), the Fundamental Research Funds for the Central Universities (CUC25SG013), and the Ministry of Education Key Laboratory for Intelligent Analysis and Security Governance of Ethnic Languages (ORP-202405).

Research Keywords

  • Multimodal named entity recognition
  • Multimodal relation extraction
  • Hypergraph neural network
  • Hypergraph learning
  • Multimodal alignment

