Stroke Mortality Prediction Based on Ensemble Learning and the Combination of Structured and Textual Data
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Article number | 106176 |
Number of pages | 34 |
Journal / Publication | Computers in Biology and Medicine |
Online published | 28 Oct 2022 |
Publication status | Online published - 28 Oct 2022 |
Link(s)
DOI | DOI |
---|---|
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(12e71291-4c20-4489-b427-b9559685ad8e).html |
Abstract
For severe cerebrovascular diseases such as stroke, predicting patients' short-term mortality has tremendous medical significance. In this study, we combined six machine learning models, namely the Random Forest (RF) classifier, Adaptive Boosting (AdaBoost), the Extremely Randomised Trees (ExtraTree) classifier, the XGBoost classifier, TabNet, and DistilBERT, to construct a multi-level prediction model that used bioassay data and radiology text reports from haemorrhagic and ischaemic stroke patients to predict six-month mortality. Model performance was measured using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), precision, recall, and F1-score. The models were built using data from 19,616 haemorrhagic stroke patients and 50,178 ischaemic stroke patients. Novel six-month mortality prediction models were developed for these patients, and combining laboratory test data, structured data, and textual radiology report data enhanced their performance. The achieved performances were as follows: AUROC = 0.89, AUPRC = 0.70, precision = 0.52, recall = 0.78, and F1-score = 0.63 for haemorrhagic patients, and AUROC = 0.88, AUPRC = 0.54, precision = 0.34, recall = 0.80, and F1-score = 0.48 for ischaemic patients. Such models could be used for mortality risk assessment and early identification of high-risk stroke patients. This could contribute to more efficient utilisation of healthcare resources for stroke survivors.
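The F1-score is the harmonic mean of precision and recall, so the reported figures can be related to one another with a short calculation. The sketch below is illustrative only and is not taken from the paper: the function name `f1_from_precision_recall` and the `reported` dictionary are assumptions introduced here, and the inputs are the rounded values quoted in the abstract, so the recomputed F1 may differ from the published one in the last digit.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded metrics as quoted in the abstract (hypothetical container, for illustration).
reported = {
    "haemorrhagic": {"precision": 0.52, "recall": 0.78, "f1": 0.63},
    "ischaemic": {"precision": 0.34, "recall": 0.80, "f1": 0.48},
}

for cohort, metrics in reported.items():
    recomputed = f1_from_precision_recall(metrics["precision"], metrics["recall"])
    print(f"{cohort}: recomputed F1 = {recomputed:.3f}, reported F1 = {metrics['f1']:.2f}")
```

With these rounded inputs the recomputed values are about 0.62 and 0.48, which agrees with the reported F1-scores to within rounding. In both cohorts recall is much higher than precision, so the models appear to favour identifying high-risk patients at the cost of more false positives.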
Research Area(s)
- Deep Learning, Machine Learning, Modelling and Prediction
Citation Format(s)
Stroke Mortality Prediction Based on Ensemble Learning and the Combination of Structured and Textual Data. / Huang, Ruixuan; Liu, Jundong; Wan, Tsz Kin et al.
In: Computers in Biology and Medicine, 28.10.2022.