mapAlign : An Efficient Approach for Mapping and Aligning Long Reads to Reference Genomes

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45); 32_Refereed conference paper (with ISBN/ISSN)

Detail(s)

Original language: English
Title of host publication: Bioinformatics Research and Applications - 16th International Symposium, ISBRA 2020, Proceedings
Editors: Zhipeng Cai, Ion Mandoiu, Giri Narasimhan
Publisher: Springer
Pages: 105-118
ISBN (Electronic): 9783030578213
ISBN (Print): 9783030578206
Publication status: Published - Dec 2020

Publication series

Name: Lecture Notes in Computer Science
Volume: 12304
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Title: 16th International Symposium on Bioinformatics Research and Applications (ISBRA 2020)
Place: Russian Federation
City: Moscow
Period: 1 - 4 December 2020

Abstract

Long reads play an important role in identifying structural variants, sequencing repetitive regions, phasing alleles, etc. In this paper, we propose a new approach for mapping long reads to reference genomes. We also propose a new method for generating accurate alignments between the long reads and the corresponding segments of the reference genome. The new mapping algorithm is based on the longest common subsequence with distance constraints. The new (local) alignment algorithm is based on the idea of recursive alignment of variable-size k-mers. Experiments show that our new method generates better alignments in terms of both identity and alignment scores for both Nanopore and SMRT data sets. In particular, on human Nanopore and SMRT data sets, our method aligns 91.53% and 85.36% of the letters in reads to identical letters in the reference genome, respectively. The state-of-the-art method aligns only 88.44% and 79.08% of the letters in reads for the Nanopore and SMRT data sets, respectively. Our method is also faster than the state-of-the-art method.
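
The abstract does not give the details of the mapping step, so the following is only an illustrative sketch of the general idea of a longest-common-subsequence style chain with a distance constraint, not the authors' algorithm. It assumes that shared k-mers between a read and a reference act as the matched units, and that the "distance constraint" means the gap between consecutive matches on the read and on the reference may differ by at most a fixed amount. The function names `kmer_anchors` and `chain_with_distance`, the parameter `max_gap_diff`, and the quadratic dynamic program are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

Anchor = Tuple[int, int]  # (position on read, position on reference)


def kmer_anchors(read: str, ref: str, k: int) -> List[Anchor]:
    """Collect every (read_pos, ref_pos) pair where the same k-mer occurs."""
    index: Dict[str, List[int]] = {}
    for j in range(len(ref) - k + 1):
        index.setdefault(ref[j:j + k], []).append(j)
    anchors: List[Anchor] = []
    for i in range(len(read) - k + 1):
        for j in index.get(read[i:i + k], []):
            anchors.append((i, j))
    return anchors


def chain_with_distance(anchors: List[Anchor], max_gap_diff: int) -> List[Anchor]:
    """Longest chain of anchors that increases on both sequences.

    Assumed constraint (hypothetical): for consecutive anchors in the chain,
    the gap on the read and the gap on the reference differ by at most
    max_gap_diff. Simple O(n^2) dynamic program, for illustration only.
    """
    anchors = sorted(anchors)
    n = len(anchors)
    if n == 0:
        return []
    best = [1] * n    # best[b]: length of the longest valid chain ending at b
    prev = [-1] * n   # back-pointers for reconstructing the chain
    for b in range(n):
        ib, jb = anchors[b]
        for a in range(b):
            ia, ja = anchors[a]
            colinear = ia < ib and ja < jb
            if colinear and abs((ib - ia) - (jb - ja)) <= max_gap_diff:
                if best[a] + 1 > best[b]:
                    best[b] = best[a] + 1
                    prev[b] = a
    end = max(range(n), key=best.__getitem__)
    chain: List[Anchor] = []
    while end != -1:
        chain.append(anchors[end])
        end = prev[end]
    return chain[::-1]
```

In this sketch, an anchor such as (10, 50) that lies far off the diagonal defined by its neighbours is excluded from the chain, which is the kind of spurious match a distance constraint is meant to filter out when mapping a read to a candidate reference segment.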

Research Area(s)

  • LCS with distance constraints, Local alignment of long reads, Long read mapping, Variable length k-mer alignment

Citation Format(s)

mapAlign : An Efficient Approach for Mapping and Aligning Long Reads to Reference Genomes. / Yang, Wen; Wang, Lusheng.

Bioinformatics Research and Applications - 16th International Symposium, ISBRA 2020, Proceedings. ed. / Zhipeng Cai; Ion Mandoiu; Giri Narasimhan. Springer, 2020. p. 105-118 (Lecture Notes in Computer Science; Vol. 12304).