Cardiac health assessment across scenarios and devices using a multimodal foundation model pretrained on data from 1.7 million individuals

Xiao Gu*, Wei Tang, Jinpei Han, Veer Sangha, Fenglin Liu, Shreyank N. Gowda, Antonio H. Ribeiro, Patrick Schwab, Kim Branson, Lei Clifton, Antonio Luiz P. Ribeiro, Zhangdaihong Liu*, David A. Clifton*

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Abstract

Cardiovascular diseases remain a major contributor to the global burden of healthcare, highlighting the importance of accurate and scalable methods for cardiac monitoring. Cardiac biosignals, most notably electrocardiograms (ECG) and photoplethysmograms, are essential for diagnosing, preventing and managing cardiovascular conditions across clinical and home settings. However, their acquisition varies substantially across scenarios and devices, whereas existing analytical models often rely on homogeneous datasets and static bespoke models, limiting their robustness and generalizability in diverse real-world contexts. Here we present a cardiac sensing foundation model (CSFM) that leverages transformer architectures and a generative masked pretraining strategy to learn unified representations from heterogeneous health records. CSFM is pretrained on a multimodal integration of data from various large-scale datasets, comprising cardiac signals from approximately 1.7 million individuals and their corresponding clinical or machine-generated text reports. The embeddings derived from CSFM act as effective, transferable features across diverse cardiac sensing scenarios, supporting a seamless adaptation to the varied input configurations and sensor modalities. Extensive evaluations across diagnostic tasks, demographic recognition, vital sign measurement, clinical outcome prediction and ECG question answering demonstrate that CSFM consistently outperforms traditional one-modal-one-task approaches. Notably, CSFM maintains favourable performance across both 12-lead and single-lead ECGs, as well as in scenarios involving ECG only, photoplethysmogram only or a combination of both. This highlights its potential as a versatile and scalable foundation for comprehensive cardiac monitoring. © The Author(s) 2026.
Original language: English
Pages (from-to): 220-233
Journal: Nature Machine Intelligence
Volume: 8
Issue number: 2
Online published: 24 Feb 2026
Publication status: Published - Feb 2026

Funding

D.A.C. was funded by an NIHR Research Professorship, a Royal Academy of Engineering Research Chair and the InnoHK Hong Kong Centre for Cerebro-cardiovascular Engineering (COCHE) and was supported by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre (BRC) and the Pandemic Sciences Institute at the University of Oxford. A.L.P.R. is supported in part by National Council for Scientific and Technological Development (CNPq), Minas Gerais State Foundation for Research Support (FAPEMIG), Innovation Center on Artificial Intelligence for Health (CIIA-S) and Institute for Health Assessment and Translation for Chronic and Neglected Diseases of High Relevance (IATS-CARE).

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
