Multimodal Data with Expert Knowledge Integration for Improved Pneumonia Diagnosis

Student thesis: Doctoral Thesis

Abstract

Pneumonia continues to be a pressing global public health challenge, disproportionately affecting vulnerable populations such as older adults and individuals with pre-existing pulmonary conditions. According to the World Health Organization, pneumonia is one of the leading causes of morbidity and mortality worldwide, accounting for over 2.5 million deaths annually. This disease imposes a significant burden on high-risk groups, including the elderly and children under five years old. In older adults, comorbidities such as chronic obstructive pulmonary disease and weakened immune systems further exacerbate the risks, complicating timely diagnosis and effective treatment. Diagnostic challenges are amplified by variability in disease presentation and overlapping symptoms with other pulmonary conditions. These factors underscore the critical need for innovative diagnostic methodologies that integrate expert clinical knowledge with advanced technologies, such as artificial intelligence. By enabling earlier, more precise, and interpretable diagnoses, these advancements have the potential to significantly improve patient outcomes, reduce healthcare disparities, and alleviate the global health burden of pneumonia.

This dissertation explores the integration of multimodal data and domain-specific knowledge to advance pneumonia diagnostics. It introduces a series of methodologies underpinned by explainable artificial intelligence principles and state-of-the-art deep learning frameworks, demonstrating their efficacy across diverse clinical applications.

The research begins with an interpretable pneumonia detection framework that combines neural networks with Bayesian models to analyze a large dataset of over 35,000 cases, achieving improved diagnostic precision and interpretability. Building on this, the dissertation develops CheXMed, a multimodal learning approach designed specifically for elderly patients. By integrating chest X-ray images and clinical notes, CheXMed significantly outperforms existing models in detecting pneumonia among this vulnerable population.

Further, this dissertation presents the Multimodal Learning Network, which formalizes diagnostic rules based on clinical guidelines using Markov logic to enhance accuracy and transparency. The framework exemplifies the synergistic potential of combining multimodal data sources and standardized medical knowledge in AI-driven diagnostics.

The study also introduces a novel explainable framework using Graph Neural Networks and counterfactual reasoning. By leveraging prototype learning and hierarchical graph constructions, this approach provides intrinsic interpretability and precise diagnostic outcomes. The integration of multimodal patient data further enriches diagnostic reliability and fosters clinical trust. Collectively, this work underscores the transformative potential of combining AI with domain expertise to create patient-centered, interpretable, and effective diagnostic tools, paving the way for a new era in precision medicine.
Date of Award: 13 May 2025
Original language: English
Awarding Institution
  • City University of Hong Kong
Supervisor: Linyan LI (Supervisor) & Qingpeng ZHANG (Co-supervisor)

Keywords

  • Multimodal data
  • Bayesian networks
  • Multimodal learning network
  • Graph neural networks
  • Pneumonia diagnosis
