Stroke Mortality Prediction Based on Ensemble Learning and the Combination of Structured and Textual Data

Research output: Journal Publications and Reviews (RGC: 21, 22, 62)21_Publication in refereed journalpeer-review

View graph of relations

Author(s)

  • Damrongrat Siriwanna
  • Yat Ming Peter Woo
  • Asmir Vodencarevic
  • Chi-Wah Wong

Detail(s)

Original languageEnglish
Article number106176
Number of pages34
Journal / PublicationComputers in Biology and Medicine
Online published28 Oct 2022
Publication statusOnline published - 28 Oct 2022

Abstract

For severe cerebrovascular diseases such as stroke, the prediction of short-term mortality of patients has tremendous medical significance. In this study, we combined machine learning models Random Forest classifier (RF), Adaptive Boosting (AdaBoost), Extremely Randomised Trees (ExtraTree) classifier, XGBoost classifier, TabNet, and DistilBERT to construct a multi-level prediction model that used bioassay data and radiology text reports from haemorrhagic and ischaemic stroke patients to predict six-month mortality. The performances of the prediction models were measured using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), precision, recall, and F1-score. The prediction models were built with the use of data from 19,616 haemorrhagic stroke patients and 50,178 ischaemic stroke patients. Novel six-month mortality prediction models for these patients were developed, which enhanced the performance of the prediction models by combining laboratory test data, structured data, and textual radiology report data. The achieved performances were as follows: AUROC = 0.89, AUPRC = 0.70, precision = 0.52, recall = 0.78, and F1 score = 0.63 for haemorrhagic patients, and AUROC = 0.88, AUPRC = 0.54, precision = 0.34, recall = 0.80, and F1 score = 0.48 for ischaemic patients. Such models could be used for mortality risk assessment and early identification of high-risk stroke patients. This could contribute to more efficient utilisation of healthcare resources for stroke survivors.

Research Area(s)

  • Deep Learning, Machine Learning, Modelling and Prediction

Citation Format(s)