CancerEMC: frontline non-invasive cancer screening from circulating protein biomarkers and mutations in cell-free DNA

Research output: Journal Publications and Reviews (RGC: 21, 22, 62); 21_Publication in refereed journal; peer-review

1 Scopus Citations


Original language: English
Pages (from-to): 3319–3327
Journal / Publication: Bioinformatics
Issue number: 19
Online published: 30 Jan 2021
Publication status: Published - 1 Oct 2021


Motivation: Early detection of cancer through accessible blood tests can enable earlier patient intervention. Although cancer detection from cell-free DNA (cfDNA) has advanced, its accuracy remains uncertain. Given the importance and broad impact of this problem, we aim to address the challenge.

Methods: A bagging Ensemble Meta Classifier (CancerEMC) is proposed for early cancer detection based on circulating protein biomarkers and mutations in cfDNA from the blood. CancerEMC is designed for both binary cancer detection and multi-class cancer type localization. It addresses the class imbalance problem in multi-analyte blood test data using robust oversampling and adaptive synthesis techniques.
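
To make the combination of oversampling and bagging concrete, the following is a minimal sketch and not the authors' implementation. It assumes plain random duplication of minority-class rows (a simpler stand-in for the adaptive synthesis techniques mentioned above) and a nearest-centroid base learner chosen only for brevity. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def oversample(X, y, rng):
    """Random oversampling (assumption: a stand-in for adaptive synthesis).
    Duplicates minority-class rows until every class matches the largest one."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out

def fit_centroids(X, y):
    """Hypothetical base learner: one mean feature vector per class."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroids(model, row):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda label: dist2(model[label]))

def fit_bagging(X, y, n_estimators=25, seed=0):
    """Bagging: balance the data, then train each base learner on a bootstrap sample."""
    rng = random.Random(seed)
    X_bal, y_bal = oversample(X, y, rng)
    n = len(X_bal)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_centroids([X_bal[i] for i in idx], [y_bal[i] for i in idx]))
    return models

def predict_bagging(models, row):
    """Aggregate the base learners by majority vote."""
    votes = Counter(predict_centroids(m, row) for m in models)
    return votes.most_common(1)[0][0]

# Toy, made-up two-feature data: 6 "normal" samples versus 2 "cancer" samples.
X = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2],
     [0.0, -0.2], [-0.2, 0.1], [5.0, 5.1], [4.9, 5.0]]
y = ["normal"] * 6 + ["cancer"] * 2

models = fit_bagging(X, y)
print(predict_bagging(models, [4.8, 5.2]))
print(predict_bagging(models, [0.1, 0.0]))
```

In this sketch the oversampling step runs once before the bootstrap draws. Balancing inside each bootstrap sample is an equally plausible design, and the abstract does not say which one CancerEMC uses.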

Results: On clinical blood test data, the proposed CancerEMC outperforms other algorithms and state-of-the-art studies (including CancerSEEK, published in Science, 2018) for cancer detection. CancerEMC achieves the best performance for both binary cancer classification, with 99.1748% accuracy (AUC = 0.999), and localized multiple cancer detection, with 74.1214% accuracy (AUC = 0.938). When the data imbalance issue is addressed with oversampling techniques, the accuracy increases to 91.4966% (AUC = 0.992), whereas the state-of-the-art method is estimated at only 69.64% (AUC = 0.921). Similar results are also observed on independent and isolated testing data.
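
For readers less familiar with the two metrics reported above, the sketch below shows one common way to compute accuracy and binary AUC. It is illustrative only: the labels and scores are made up, the function names are hypothetical, and it does not reproduce the paper's evaluation protocol or its multi-class AUC, whose averaging scheme is not stated in the abstract.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """Binary AUC as the probability that a random positive sample is scored
    higher than a random negative sample (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = cancer, 0 = normal; scores are classifier confidences.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # threshold of 0.5 is an assumption

print(accuracy(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy depends on the chosen decision threshold, while AUC summarizes ranking quality across all thresholds, which is why the two figures can diverge on imbalanced data.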

Research Area(s)