Project Details
Description
Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in visual question answering, multimodal reasoning, and human. However, their deployment in real-world applications remains limited by frequent hallucinations, in which models generate confident yet incorrect responses. A key challenge is the lack of reliable uncertainty quantification methods that can distinguish uncertainty caused by ambiguous data (aleatoric) from uncertainty caused by model limitations (epistemic). This project aims to develop a principled framework for uncertainty estimation in MLLMs based on causal invariance. By analyzing how model predictions change when query-irrelevant visual features are removed, the project will isolate the epistemic uncertainty that reflects limitations in model reasoning. The research will design novel metrics that measure the semantic divergence between predictions on the original inputs and predictions on causally filtered inputs, and will further develop efficient geometric proxies that operate in embedding space for scalable deployment. Through systematic experiments on multimodal benchmarks, the project will investigate how causal-aware uncertainty estimation improves hallucination detection, reliability assessment, and adaptive inference strategies. The expected outcomes include new uncertainty quantification algorithms for multimodal models and practical techniques that enhance the safety and trustworthiness of AI systems deployed in real-world multimodal environments.
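The description does not specify the metrics themselves, so the following is only an illustrative sketch of the general idea, not the project's method. It assumes that answers have already been sampled from a model for an original image and for a filtered copy with query-irrelevant features removed, and it compares the two answer distributions with the Jensen-Shannon divergence. It also shows cosine distance between two embedding vectors as one possible geometric proxy. All function names and the sample data are hypothetical, and exact string matching stands in for a real notion of semantic equivalence.

```python
import math
from collections import Counter


def answer_distribution(samples):
    """Empirical distribution over sampled answers (exact-match grouping;
    an assumption standing in for semantic clustering of answers)."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {answer: count / total for answer, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions
    given as dicts. Ranges from 0 (identical) to 1 (disjoint support)."""
    support = set(p) | set(q)
    mixture = {a: 0.5 * (p.get(a, 0.0) + q.get(a, 0.0)) for a in support}

    def kl_to_mixture(dist):
        return sum(
            prob * math.log2(prob / mixture[a])
            for a, prob in dist.items()
            if prob > 0
        )

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors; a hypothetical
    embedding-space proxy for the divergence above."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical sampled answers for the same question.
original_answers = ["red", "red", "red", "blue"]   # full image
filtered_answers = ["red", "blue", "blue", "blue"]  # irrelevant regions removed

divergence = js_divergence(
    answer_distribution(original_answers),
    answer_distribution(filtered_answers),
)
print(f"prediction divergence (bits): {divergence:.3f}")
```

Under this reading, a prediction that is invariant to the removal of query-irrelevant content yields a divergence near zero, while a large divergence suggests the model's answer depended on features that should not matter, which is the kind of signal the project proposes to attribute to epistemic uncertainty.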
| Project number | 7020206 |
|---|---|
| Grant type | REG-Small Scale |
| Status | Active |
| Effective start/end date | 1/05/26 → … |