Detecting Symptom Errors in Neural Machine Translation of Patient Health Information on Depressive Disorders: Developing Interpretable Bayesian Machine Learning Classifiers

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

3 Scopus Citations

Author(s)

  • Wenxiu Xie
  • Meng Ji
  • Mengdan Zhao
  • Tianqi Zhou
  • Fan Yang
  • Xiaobo Qian
  • Tianyong Hao

Detail(s)

Original language: English
Article number: 771562
Journal / Publication: Frontiers in Psychiatry
Volume: 12
Online published: 21 Oct 2021
Publication status: Published - Oct 2021

Abstract

Background: Due to its convenience, wide availability, and low usage cost, neural machine translation (NMT) is increasingly applied in diverse clinical settings and in web-based self-diagnosis of diseases. Because NMT tools are still developing, this can pose safety risks to multicultural communities with limited bilingual skills, low education, and low health literacy. Research is needed to scrutinise the reliability, credibility, and usability of automatically translated patient health information.

Objective: We aimed to develop high-performing Bayesian machine learning classifiers to assist clinical professionals and healthcare workers in assessing the quality and usability of NMT output on depressive disorders. The tool does not require frontline health and medical professionals to have any prior knowledge of the target language used by patients.

Methods: We used a Relevance Vector Machine (RVM) to increase the generalisability and clinical interpretability of the classifiers. The RVM is a typical sparse Bayesian classifier that is less prone to overfitting on small training datasets. We optimised the RVM by combining automatic recursive feature elimination with expert feature refinement from the perspective of health linguistics. We evaluated the diagnostic utility of the Bayesian classifier under different probability cut-offs in terms of sensitivity, specificity, and positive and negative likelihood ratios, against clinical thresholds for diagnostic tests. Finally, we illustrated how the RVM tool can be interpreted in clinical settings using Bayes' nomogram.
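The cut-off evaluation described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `diagnostic_stats`, the 0.5 cut-off, and the toy probabilities and labels are all assumptions made for the example, and no data from the study is used.

```python
from typing import NamedTuple, Sequence


class DiagnosticStats(NamedTuple):
    sensitivity: float
    specificity: float
    lr_pos: float  # positive likelihood ratio, LR+
    lr_neg: float  # negative likelihood ratio, LR-


def diagnostic_stats(probs: Sequence[float], labels: Sequence[int],
                     cutoff: float) -> DiagnosticStats:
    """Score a probabilistic classifier at one probability cut-off.

    A text is called "positive" when its predicted probability is >= cutoff.
    Labels use 1 for the error-prone class and 0 for the safe class.
    """
    pairs = list(zip(probs, labels))
    tp = sum(1 for p, y in pairs if p >= cutoff and y == 1)
    fn = sum(1 for p, y in pairs if p < cutoff and y == 1)
    tn = sum(1 for p, y in pairs if p < cutoff and y == 0)
    fp = sum(1 for p, y in pairs if p >= cutoff and y == 0)

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # LR+ = sensitivity / (1 - specificity); LR- = (1 - sensitivity) / specificity
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return DiagnosticStats(sensitivity, specificity, lr_pos, lr_neg)


# Hypothetical toy data, for illustration only.
toy_probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
stats = diagnostic_stats(toy_probs, toy_labels, cutoff=0.5)
print(stats)
```

Repeating the call over a range of cut-offs gives the trade-off between sensitivity and specificity from which an operating point can be chosen.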

Results: After automatic and expert-based feature optimisation, the best-performing RVM classifier (RVM_DUFS12) achieved the highest AUC (0.8872) among 52 competing models with distinct optimised, normalised feature sets. It also had statistically higher sensitivity and specificity than the other models. We evaluated the diagnostic utility of the best-performing model using Bayes' nomogram. It had a positive likelihood ratio (LR+) of 4.62 (95% C.I.: 2.53, 8.43), and the associated posterior probability (odds) was 83% (5.0) (95% C.I.: 73%, 90%), meaning that approximately 10 in 12 English texts with a positive test result are likely to contain information that would cause clinically significant conceptual errors if translated by Google. It had a negative likelihood ratio (LR-) of 0.18 (95% C.I.: 0.10, 0.35), and the associated posterior probability (odds) was 16% (0.2) (95% C.I.: 10%, 27%), meaning that about 10 in 12 English texts with a negative test result can be safely translated using Google.
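The step from a likelihood ratio to a posterior probability, which Bayes' nomogram performs graphically, can be sketched with the odds form of Bayes' theorem. The abstract does not state the pre-test probability, so the value 0.52 below is an assumption, chosen only because it roughly reproduces the reported posteriors of about 83% and 16%. The function name `posterior_probability` is likewise hypothetical.

```python
def posterior_probability(pretest_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability.

    Uses: post-test odds = pre-test odds * likelihood ratio.
    """
    pretest_odds = pretest_prob / (1 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# ASSUMPTION: the pre-test probability is not given in the abstract.
ASSUMED_PRETEST = 0.52

# Likelihood ratios as reported in the Results.
LR_POS = 4.62
LR_NEG = 0.18

post_pos = posterior_probability(ASSUMED_PRETEST, LR_POS)
post_neg = posterior_probability(ASSUMED_PRETEST, LR_NEG)
print(f"posterior after a positive test: {post_pos:.2f}")
print(f"posterior after a negative test: {post_neg:.2f}")
```

Under this assumed pre-test probability, a positive result raises the probability of a clinically significant error to roughly 0.83, and a negative result lowers it to roughly 0.16, in line with the figures above. A different pre-test probability would shift both values.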

Research Area(s)

  • depressive disorder, mental health, neural machine translation (NMT), translation error detection, machine learning application

Citation Format(s)

Detecting Symptom Errors in Neural Machine Translation of Patient Health Information on Depressive Disorders: Developing Interpretable Bayesian Machine Learning Classifiers. / Xie, Wenxiu; Ji, Meng; Zhao, Mengdan et al.
In: Frontiers in Psychiatry, Vol. 12, 771562, 10.2021.
