Sensitive Cancer Detection Using Circulating Cell-free DNA


Student thesis: Doctoral Thesis


Award date: 28 Apr 2020


Detection and monitoring of tumor-derived circulating DNA (circulating tumor DNA; ctDNA) in plasma is rapidly becoming a diagnostic, prognostic and predictive tool in cancer patient care. Studies have shown that ctDNA enables non-invasive, earlier diagnosis of cancer and monitoring of recurrence or resistance for timely intervention. Selecting an appropriate intervention strategy requires a reliable technology capable of genotyping multiple genes in parallel and accurately identifying diagnostic or actionable targets. However, the low efficiency of current technologies in enriching tumor DNA admixed with an overwhelming excess of non-tumor-derived cell-free DNA (cfDNA) in circulation remains a major challenge in realizing the potential of ctDNA.

To address these challenges, we developed a highly efficient next-generation sequencing (NGS)-based method capable of detecting rare mutations from as little as 1 ng of input DNA, paired with a highly sensitive and specific mutation-calling algorithm. The method is a bi-strand multiplex target enrichment approach that generates multiple copies of each target region through linear amplification and ligates the single-stranded amplicons onto unique molecular identifier (UMI)-tagged adapters; it is hence termed Linear Amplification and Single-strand Target enrichment Sequencing (LAST-Seq). This design minimizes the possibility of missing rare variants and of calling false positives. Further, our custom bin-based algorithm and UMI-aware bioinformatics pipeline segregate the background noise into different bins using flanking sequences.
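As a rough illustration of what "UMI-aware" processing generally involves, the sketch below groups reads covering one position by their UMI and keeps a consensus base only for families that agree. This is a generic majority-vote scheme, not the thesis's bin-based algorithm (which additionally uses flanking sequences); the function name `umi_consensus` and the thresholds `min_family_size` and `min_agreement` are hypothetical and chosen only for the example.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=2, min_agreement=0.9):
    """Collapse reads at one position into one consensus base per UMI.

    reads: iterable of (umi, base) pairs.
    Returns a dict mapping each retained UMI to its consensus base.
    Thresholds are illustrative assumptions, not values from the thesis.
    """
    # Group observed bases by the UMI that tagged the original molecule.
    families = defaultdict(list)
    for umi, base in reads:
        families[umi].append(base)

    consensus = {}
    for umi, bases in families.items():
        # Singletons cannot be error-corrected, so they are skipped.
        if len(bases) < min_family_size:
            continue
        base, count = Counter(bases).most_common(1)[0]
        # Keep the family only if its reads agree strongly enough.
        if count / len(bases) >= min_agreement:
            consensus[umi] = base
    return consensus


if __name__ == "__main__":
    example = (
        [("AAC", "T")] * 3                                # concordant family
        + [("GGT", "C"), ("GGT", "C"), ("GGT", "T")]      # discordant family
        + [("TTA", "T")]                                  # singleton
    )
    print(umi_consensus(example))
```

Under these assumptions, a base supported by only one read in a family is treated as a likely amplification or sequencing error rather than a true variant.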

For multiplexed target enrichment, extended annealing and extension times, together with an appropriately selected DNA polymerase and optimized conditions, achieved uniform coverage (99.9% at >0.2x mean depth) for a panel of targets, including genomic loci that are difficult to target conventionally. Error rates in NGS reads differ by DNA base substitution type (transition and transversion) and further differ by sequence context. We modeled this random sequencing noise by segregating it into the 12 possible substitutions. The well-known higher error rate of transitions relative to transversions, combined with sequence-context-specific error models, allows sensitive calling of low-level true mutation signals that would otherwise be submerged under the global noise level.
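The benefit of a substitution-specific noise model can be sketched as follows. The example enumerates the 12 substitution classes and tests an observed alternate-read count against a per-class background rate using a binomial tail probability. The binomial test, the `alpha` threshold, and the background rates are assumptions made for illustration; the thesis does not state which statistical test or parameter values it uses.

```python
from math import exp, lgamma, log, log1p

BASES = "ACGT"
# The 12 possible single-base substitutions, written as "ref>alt".
SUBSTITUTIONS = [f"{r}>{a}" for r in BASES for a in BASES if r != a]
# Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
TRANSITIONS = {"A>G", "G>A", "C>T", "T>C"}


def is_transition(sub):
    return sub in TRANSITIONS


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log1p(-p))
        total += exp(log_pmf)
    return min(total, 1.0)


def call_variant(sub, alt_reads, depth, error_rates, alpha=1e-3):
    """Call a variant if its count is unlikely under that class's noise rate."""
    return binomial_tail(alt_reads, depth, error_rates[sub]) < alpha


if __name__ == "__main__":
    # Hypothetical background rates: transitions noisier than transversions.
    rates = {s: (1e-3 if is_transition(s) else 1e-4) for s in SUBSTITUTIONS}
    for sub in ("C>T", "C>A"):
        print(sub, call_variant(sub, alt_reads=5, depth=2000, error_rates=rates))
```

With these made-up rates, the same five supporting reads at depth 2000 are called in the quieter transversion class but not in the noisier transition class, which is the effect a single global noise threshold would hide.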

The performance of the method was evaluated with a reference standard DNA (HD734, HorizonDX) that contains 44 mutations, tested in dilutions to obtain low mutant allele frequencies (MAFs) and at low inputs. Using 10 ng of the DNA diluted 2-fold (MAF ~ 0.5%), LAST-Seq achieved 100% sensitivity and specificity. When as little as 1 ng of the DNA was used as input (undiluted, MAF ~ 1%), LAST-Seq efficiently converted the DNA fragments into effective libraries (average mapping rate to the reference genome 99.83%, on-target 98.1%, uniformity 99.9%) and detected 43 out of 44 mutations (97.7% sensitivity), approaching the efficiency at which targets are missed through random sampling alone. No false positives were called. We then successfully validated the method with an expanded custom primer panel detecting relevant variants in lung cancer, using ~ 1 mL of plasma. In an analysis of cfDNA obtained from the plasma of 77 lung cancer cases and 63 non-cancer controls, mutations were identified in 47 (61%) of the cancer patients and in 5 (7.9%) of the non-cancer controls, the latter indicating the presence of features of somatic mosaicism. For verification, analysis of 6 KRAS-mutated and 64 KRAS-wildtype cfDNA samples with droplet digital PCR (ddPCR) showed positive results in 4 out of 6, while none of the 64 wildtype samples had a mutation. Overall, these results demonstrate the superior efficiency of LAST-Seq and the technical feasibility of analyzing trace amounts of input material with high accuracy.
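The random-sampling limit mentioned for the 1 ng experiment can be estimated with a back-of-the-envelope calculation. The sketch below assumes about 3.3 pg of DNA per haploid human genome, independent sampling of genome copies, and that every one of the 44 mutations sits at roughly 1% MAF; these are simplifying assumptions for illustration, not figures taken from the thesis.

```python
HAPLOID_GENOME_PG = 3.3  # assumed mass of one haploid human genome (picograms)


def genome_copies(input_ng):
    """Approximate number of haploid genome copies in the input DNA."""
    return input_ng * 1000.0 / HAPLOID_GENOME_PG


def prob_no_mutant_copy(input_ng, maf):
    """Probability that the input contains zero mutant copies of a locus."""
    n = round(genome_copies(input_ng))
    return (1.0 - maf) ** n


def expected_missed(n_mutations, input_ng, maf):
    """Expected number of mutations absent from the input by chance alone."""
    return n_mutations * prob_no_mutant_copy(input_ng, maf)


def sensitivity(detected, total):
    return detected / total


if __name__ == "__main__":
    print(f"copies in 1 ng: {genome_copies(1.0):.0f}")
    print(f"P(no mutant copy at 1% MAF): {prob_no_mutant_copy(1.0, 0.01):.3f}")
    print(f"expected misses out of 44: {expected_missed(44, 1.0, 0.01):.1f}")
    print(f"observed sensitivity: {sensitivity(43, 44):.1%}")
```

Under these assumptions, 1 ng holds only about 300 genome copies, so each 1% mutation has a chance of a few percent of being absent from the tube altogether. Missing one mutation out of 44 is therefore in line with what sampling alone would be expected to cause.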

    Research areas

  • circulating tumor DNA, efficiency, next generation sequencing, LAST-Seq, lung cancer