
MedDiTPro: A Prompt-Guided Diffusion Transformer for Multimodal Longitudinal Medical Data Synthesis

Yuan Zhong, Xiaochen Wang, Jiaqi Wang, Xiaokun Zhang*, Fenglong Ma

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

Diffusion models have recently emerged as a state-of-the-art approach for synthetic Electronic Health Record (EHR) generation, offering superior fidelity and diversity over traditional generative models. However, existing diffusion-based methods face two challenges. First, their representation learning and modality utilization are limited: they fail to explicitly capture inter-modality dependencies and fine-grained code-level interactions. Second, their adaptability is constrained by reliance on U-Net-based architectures, which are not well suited to the heterogeneous and evolving nature of EHR data. Furthermore, current evaluation paradigms rely on either perplexity-based sequence modeling or global distributional measures, and so lack robustness in assessing both intra-visit code relationships and inter-visit temporal patterns. To address these limitations, we propose MedDiTPro, a diffusion transformer-based framework that enhances multimodal EHR generation by integrating structured modality-aware guidance. MedDiTPro combines three components: a unified transformer for intra-visit representation learning, a modality-specific and datawise prompt learner, and a diffusion transformer with structured guidance. Together, these enable it to generate diverse and clinically meaningful synthetic records. Extensive experiments on publicly available datasets demonstrate that MedDiTPro achieves state-of-the-art fidelity, privacy preservation, and utility. © 2025 ACM.
Original language: English
Title of host publication: KDD '25
Subtitle of host publication: Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2
Publisher: Association for Computing Machinery
Pages: 4086-4097
Number of pages: 12
ISBN (Print): 979-8-4007-1454-2
Publication status: Published - 3 Aug 2025
Event: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2025) - Toronto, Canada
Duration: 3 Aug 2025 to 7 Aug 2025
https://kdd2025.kdd.org/
https://dl.acm.org/conference/kdd/proceedings

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: 2
ISSN (Print): 2154-817X

Conference

Conference: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2025)
Place: Canada
City: Toronto
Period: 3/08/25 to 7/08/25

Bibliographical note

Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).

Research Keywords

  • diffusion models
  • electronic health records
  • medical data synthesis
  • multimodal data mining
