Abstract
Motivation: Microbial communities play an essential role in human diseases and physiological activities. The functions of microbes can differ due to strain-level differences in their genome sequences. Shotgun metagenomic sequencing makes it practical to profile the strains in microbial communities. However, current methods remain underdeveloped because sequences are highly similar among strains. We observe that strain genotypes at the same single nucleotide variant (SNV) locus can be inferred from the genotype frequencies. In addition, variants at different loci that are covered by the same reads provide evidence that they reside on the same strain.
Results: These insights inspired us to design PStrain, an optimization method that uses genotype frequencies and reads covering multiple SNV loci to profile strains iteratively, based on SNVs in a set of MetaPhlAn2 marker genes. Compared with state-of-the-art methods, PStrain improved the performance of inferring strain abundances and genotypes by 87.75% and 59.45% on average, respectively. We applied the PStrain package to a dataset with two colorectal cancer (CRC) cohorts and found that the sequences of Bacteroides coprocola strains differ significantly between CRC and control samples; this is the first report of a potential role of B. coprocola in the gut microbiota of CRC.
| Original language | English |
|---|---|
| Pages (from-to) | 5499–5506 |
| Journal | Bioinformatics |
| Volume | 36 |
| Issue number | 22-23 |
| Online published | 21 Dec 2020 |
| DOIs | |
| Publication status | Published - Dec 2020 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Fingerprint
Dive into the research topics of 'PStrain: an iterative microbial strains profiling algorithm for shotgun metagenomic sequencing data'. Together they form a unique fingerprint.
Projects
- 1 Finished
- GRF: Algorithms and Models for Local Genomics Map of Oncogenic Virus Integration
  LI, S. (Principal Investigator / Project Coordinator)
  1/10/16 → 8/09/20
  Project: Research