
Data mining-based model and risk prediction of colorectal cancer by using secondary health data: A systematic review

Hailun Liang, Lei Yang, Lei Tao, Leiyu Shi, Wuyang Yang, Jiawei Bai, Da Zheng, Ning Wang*, Jiafu Ji*

*Corresponding author for this work

    Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

    Abstract

    Objective: Prevention and early detection of colorectal cancer (CRC) can increase the chances of successful treatment and reduce the disease burden. Various data mining technologies have been used to strengthen the early detection of CRC in primary care, but evidence synthesis on the effectiveness of these models is scant. This systematic review synthesizes studies that examine the effect of data mining on improving risk prediction of CRC.

    Methods: The PRISMA framework guided the conduct of this study. We obtained papers via PubMed, Cochrane Library, EMBASE and Google Scholar. Quality appraisal was performed using Downs and Black's quality checklist. To evaluate the performance of the included models, the values of specificity and sensitivity were compared, the values of area under the curve (AUC) were plotted, and the median of the overall AUC of the included studies was computed.

    Results: A total of 316 studies were reviewed in full text, and seven articles were included. The included studies implemented techniques including artificial neural networks, Bayesian networks and decision trees. Six articles reported the overall model accuracy. Overall, the median AUC was 0.8243 [interquartile range (IQR): 0.8050-0.8886]. In the two articles that reported comparisons with traditional models, the data mining method performed better than the traditional models, with the best AUC improvement of 10.7%.

    Conclusions: The adoption of data mining technologies for CRC detection is at an early stage. The limited number of included articles and the heterogeneity of those studies imply that more rigorous research is needed to further investigate the techniques' effects.
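
    The summary measures named in the Methods (sensitivity, specificity, and the median AUC with its IQR) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the quartile convention (Python's "inclusive" method) and every number below are assumptions chosen for illustration, and none of them are values from the included studies.

```python
from statistics import quantiles


def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: share of actual cases the model flags."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: share of non-cases the model clears."""
    return tn / (tn + fp)


def median_and_iqr(values: list[float]) -> tuple[float, tuple[float, float]]:
    """Return the median and the (Q1, Q3) bounds of the interquartile range.

    Assumption: quartiles use the 'inclusive' method; the review does not
    state which quartile convention was applied.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)


# Hypothetical confusion-matrix counts for one model (illustrative only).
example_sensitivity = sensitivity(tp=80, fn=20)
example_specificity = specificity(tn=90, fp=10)

# Hypothetical per-study AUCs (illustrative only, not the reviewed studies).
example_aucs = [0.70, 0.75, 0.80, 0.85, 0.90]
example_median, (example_q1, example_q3) = median_and_iqr(example_aucs)

print(f"sensitivity={example_sensitivity:.2f}, specificity={example_specificity:.2f}")
print(f"median AUC={example_median:.4f} [IQR: {example_q1:.4f}-{example_q3:.4f}]")
```

    A different quartile convention can shift the IQR bounds slightly when only a handful of studies contribute an AUC, so the convention is worth stating alongside the result.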

    Original language: English
    Pages (from-to): 242-251
    Number of pages: 10
    Journal: Chinese Journal of Cancer Research
    Volume: 32
    Issue number: 2
    DOIs
    Publication status: Published - 2020

    UN SDGs

    This output contributes to the following UN Sustainable Development Goals (SDGs)

    1. SDG 3 - Good Health and Well-being

    Research Keywords

    • Systematic review
    • colorectal cancer
    • disease detection
    • data mining
    • PRIMARY-CARE
    • BIG DATA
    • OPPORTUNITIES
    • VALIDATION
    • DIAGNOSIS
    • RECORDS
