Abstract
This study focuses on multimodal topic modeling and aims to separate public topics (shared across modalities) from private topics (unique to each modality) hidden in text and image data. To this end, we propose a novel Disentangled Multimodal Neural Topic Model (DMNTM). Specifically, we design a modality-specific encoder with an independence constraint to capture private topics, and a public encoder with a product-of-experts module to extract cross-modal shared topics. We conduct extensive experiments on six public datasets, including multimodal online reviews from Amazon, posts from Flickr, tweets from Twitter, and webpages from Wikipedia. Compared with state-of-the-art methods, DMNTM significantly improves topic modeling performance over the best baseline in terms of perplexity, coherence, diversity, and topic quality. In two downstream tasks, recommendation and sentiment classification, DMNTM further improves performance. These results show that disentangling public and private topics effectively enhances both the quality and utility of multimodal representations. © 2026 Elsevier Ltd.
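The abstract does not spell out how the product-of-experts module combines the text and image encoders. As an illustration only, the sketch below assumes each modality encoder outputs a diagonal Gaussian (a mean and a log-variance per latent dimension), as is common in multimodal variational autoencoders, and fuses them by precision weighting. The function name `product_of_experts`, its parameters, and the optional standard-normal prior expert are hypothetical and are not taken from the paper.

```python
import math


def product_of_experts(mus, logvars, include_prior=True):
    """Fuse diagonal Gaussian experts into one Gaussian (illustrative sketch).

    mus, logvars: one list of floats per expert (e.g. text, image),
    all of the same length. Returns (fused_mu, fused_logvar).
    Assumption: Gaussian experts; the paper's actual module may differ.
    """
    dim = len(mus[0])
    fused_mu, fused_logvar = [], []
    for d in range(dim):
        # An optional N(0, 1) prior expert contributes precision 1 and mean 0.
        precision_sum = 1.0 if include_prior else 0.0
        weighted_mean_sum = 0.0
        for mu, logvar in zip(mus, logvars):
            precision = math.exp(-logvar[d])  # precision = 1 / variance
            precision_sum += precision
            weighted_mean_sum += mu[d] * precision
        # The product of Gaussians is Gaussian: precisions add,
        # and the mean is the precision-weighted average of the expert means.
        fused_mu.append(weighted_mean_sum / precision_sum)
        fused_logvar.append(-math.log(precision_sum))
    return fused_mu, fused_logvar


# Two equally confident experts that disagree on the mean (1.0 vs 3.0).
text_mu, text_logvar = [1.0], [0.0]
image_mu, image_logvar = [3.0], [0.0]
mu, logvar = product_of_experts(
    [text_mu, image_mu], [text_logvar, image_logvar], include_prior=False
)
print(mu, logvar)
```

Under this assumption, a modality with lower variance pulls the shared posterior toward its own estimate, which is one plausible reason a product-of-experts is used for the shared (public) representation.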
| Original language | English |
|---|---|
| Article number | 104683 |
| Number of pages | 16 |
| Journal | Information Processing & Management |
| Volume | 63 |
| Issue number | 5 |
| Online published | 17 Feb 2026 |
| DOIs | |
| Publication status | Online published - 17 Feb 2026 |
| Externally published | Yes |
Funding
This work is supported by the National Natural Science Foundation of China (72342011, 72101072, 72271084, 72171071, 72571096, 72322019), the Fundamental Research Funds for the Central Universities (JZ2024HGTG0316, PA2023IISL0103), and the National Engineering Laboratory for Big Data Distribution and Exchange Technologies.
Research Keywords
- Multimodal analysis
- Neural topic modeling
- Disentanglement learning
- Public and private topics
- Deep learning
Fingerprint
Dive into the research topics of 'A disentangled multimodal neural topic model'. Together they form a unique fingerprint.