Cancer Early Detection and Heterogeneity Dissection Based on Genome-Wide Molecular Data Analysis


Student thesis: Doctoral Thesis





Award date: 26 Jun 2023


Cancer was the second leading cause of death worldwide, with 19.3 million new cases diagnosed and 9.9 million deaths in 2020. It has been predicted that cancer will become the leading cause of death in this century if the current trend continues. China accounted for 24% of new cases and 30% of deaths worldwide in 2020. If cancer is detected promptly at an early stage, treatment is more effective and patient survival improves drastically. Because cancer is heterogeneous, the same therapeutic strategy can produce different responses in different patients. As the study and understanding of molecular changes in cancer continue to advance, the most significant challenges in precision medicine are early cancer detection and deciphering cancer heterogeneity through molecular data profiling.

Cancer biomarkers and molecular subtyping are the best strategies to address these challenges. Recent studies have made progress in cancer biomarkers for early detection and in molecular subtyping for heterogeneity dissection, but this work still has limitations. Recent technologies such as next-generation sequencing, liquid biopsy, big data, and machine learning offer emerging opportunities to overcome these limitations by analyzing genome-wide molecular data.

The work in this thesis focuses on biomarker identification for early cancer detection and multi-omic subtyping for cancer heterogeneity dissection through genome-wide data analysis. The chapters of this thesis are described below:

Chapter 1: This chapter introduces the background of the study, including cancer incidence and mortality statistics and molecular changes in cancer. There are two major challenges in precision medicine: early cancer detection and precision treatment. To overcome these two challenges, cancer biomarker identification for early detection and molecular subtyping for heterogeneity dissection are the optimal strategies. Current progress in biomarker identification and molecular subtyping studies is described. I also discuss the limitations and opportunities for precision medicine.

Chapter 2: In this chapter, a miRNA-based signature for the early detection of esophageal squamous cell carcinoma (ESCC) was developed through a rigorous study design. Three tissue miRNA datasets were used to identify a miRNA signature that discriminated ESCC from normal tissues. The robustness of this signature was assessed in serum from two retrospective cohorts. A risk-scoring model was then derived, and the performance of the miRNA signature was evaluated in two prospective cohorts of patients with ESCC.
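
To illustrate what a risk-scoring model of this kind can look like, the following is a minimal sketch assuming a logistic form: a weighted sum of miRNA expression values passed through a sigmoid. The abstract does not state the model form, and the miRNA names, coefficients, intercept, and expression values below are hypothetical placeholders, not values from the thesis.

```python
import math


def risk_score(expression, coefficients, intercept=0.0):
    """Linear predictor: intercept plus the coefficient-weighted sum of
    expression values over the miRNAs in the signature."""
    return intercept + sum(coefficients[m] * expression[m] for m in coefficients)


def risk_probability(score):
    """Map the linear risk score to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical signature and weights (placeholders, not from the thesis).
coefficients = {"miR-A": 0.8, "miR-B": -0.5, "miR-C": 1.2}
intercept = -1.0

# Hypothetical normalized serum expression values for one sample.
sample = {"miR-A": 1.5, "miR-B": 0.4, "miR-C": 0.9}

score = risk_score(sample, coefficients, intercept)
prob = risk_probability(score)
print(f"risk score = {score:.2f}, predicted probability = {prob:.3f}")
```

In a sketch like this, the weights would be estimated in a training cohort and then held fixed when the score is applied to independent cohorts.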

Chapter 3: In this chapter, a combination of cell-free miRNA and exosomal miRNA signatures was developed for the early detection of pancreatic ductal adenocarcinoma (PDAC). The candidate miRNAs were identified by genome-wide analysis of small RNA sequencing data. Machine learning algorithms were used to prioritize a panel of cell-free and exosomal miRNAs that discriminated patients from controls. Subsequently, the biomarker panel was trained and validated in independent cohorts.
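
As a rough illustration of how candidate miRNAs might be prioritized, the sketch below ranks each candidate by its univariate discriminative power (the area under the ROC curve, computed from pairwise case/control comparisons). This is a simple stand-in chosen for clarity; the abstract does not specify which machine learning algorithms were used, and the miRNA names and expression values are hypothetical.

```python
def auc(case_values, control_values):
    """AUC as the fraction of case/control pairs in which the case value
    is higher; ties count as half a win."""
    wins = 0.0
    for c in case_values:
        for n in control_values:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def rank_by_auc(cases, controls):
    """Rank miRNAs by discriminative power, treating up- and
    down-regulation symmetrically via max(AUC, 1 - AUC)."""
    scores = {}
    for mirna in cases:
        a = auc(cases[mirna], controls[mirna])
        scores[mirna] = max(a, 1.0 - a)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical expression values (placeholders, not from the thesis).
cases = {
    "miR-X": [5.1, 6.0, 5.8],
    "miR-Y": [2.0, 3.1, 2.5],
    "miR-Z": [1.0, 1.2, 0.9],
}
controls = {
    "miR-X": [3.0, 3.5, 4.0],
    "miR-Y": [2.2, 2.9, 2.6],
    "miR-Z": [2.0, 2.1, 1.9],
}

ranked = rank_by_auc(cases, controls)
for mirna, score in ranked:
    print(f"{mirna}: {score:.3f}")
```

A univariate ranking like this ignores interactions between miRNAs, so a multivariate model would normally be fitted on the shortlisted candidates and evaluated in a separate cohort.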

Chapter 4: This chapter describes the results of the multi-omic subtyping study of ESCC. In this study, I identified four subtypes by multi-omic data fusion. The molecular and clinical associations of the four subtypes were explored and validated, and drug sensitivity analysis was then performed. This study also collected data from seven existing subtyping systems and performed a consensus molecular subtyping study.

Chapter 5: This chapter summarizes the studies in the thesis and gives an outlook for future research.

    Research areas

  • Subtyping, Biomarker, Cancer, Bioinformatics