IMperm: a fast and comprehensive IMmune Paired-End Reads Merger for sequencing data

Wei Zhang, Jia Ju, Yong Zhou, Teng Xiong, Mengyao Wang, Chaohui Li, Shixin Lu, Zefeng Lu, Liya Lin, Xiao Liu, Shuai Cheng Li*

*Corresponding author for this work

Research output: Journal Publications and Reviews > RGC 21 - Publication in refereed journal > peer-review

1 Citation (Scopus)

Abstract

The adaptive immune receptor repertoire (AIRR), consisting of T- and B-cell receptors, is the core component of the immune system. AIRR sequencing is commonly used in cancer immunotherapy and in minimal residual disease (MRD) detection for leukemia and lymphoma. The AIRR is captured by primers and sequenced to yield paired-end (PE) reads. The PE reads can be merged into a single sequence using the region where they overlap. However, the wide variety of AIRR data makes merging difficult, so a specialized tool is required. We developed a software package for IMmune PE reads merger of sequencing data, named IMperm. We used a k-mer-and-vote strategy to pin down the overlapping region rapidly. IMperm can handle all types of PE reads, eliminate adapter contamination and successfully merge low-quality and minor/non-overlapping reads. Compared with existing tools, IMperm performed better on both simulated and sequencing data. Notably, IMperm was well suited to processing MRD detection data for leukemia and lymphoma, and it detected 19 novel MRD clones in 14 patients with leukemia from previously published data. Additionally, IMperm can handle PE reads from other sources, and we demonstrated its effectiveness on two genomic datasets and one cell-free deoxyribonucleic acid dataset. IMperm is implemented in the C programming language and consumes little runtime and memory. It is freely available at https://github.com/zhangwei2015/IMperm. © The Author(s) 2023.
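
The abstract names a k-mer-and-vote strategy but does not spell out its details, so the following is only a minimal sketch of the general idea, not IMperm's actual C implementation. It assumes that every k-mer shared between read 1 and the reverse complement of read 2 casts one vote for the offset at which read 2 would start inside read 1, and that the offset with the most votes is taken as the overlap. The function names and the parameters `k`, `min_votes` and `max_mismatch_rate` are hypothetical, and base qualities, adapter read-through and non-overlapping pairs are deliberately left out.

```python
from collections import Counter, defaultdict

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def vote_offset(read1, read2_rc, k=8):
    """Vote for the position in read1 at which read2_rc starts.

    Returns (offset, votes) for the most-voted offset, or None when the
    two sequences share no k-mer.
    """
    # Index every k-mer of read1 by its start position(s).
    index = defaultdict(list)
    for i in range(len(read1) - k + 1):
        index[read1[i:i + k]].append(i)

    # Each shared k-mer votes for the offset implied by its two positions.
    votes = Counter()
    for j in range(len(read2_rc) - k + 1):
        for i in index.get(read2_rc[j:j + k], ()):
            votes[i - j] += 1

    if not votes:
        return None
    return votes.most_common(1)[0]


def merge_pair(read1, read2, k=8, min_votes=3, max_mismatch_rate=0.1):
    """Merge one read pair, or return None if no reliable overlap is found."""
    read2_rc = reverse_complement(read2)
    best = vote_offset(read1, read2_rc, k)
    if best is None or best[1] < min_votes:
        return None
    offset, _ = best
    if offset < 0:
        # read2_rc would start before read1 (possible adapter read-through);
        # this simplified sketch does not handle that case.
        return None

    # Verify the voted overlap base by base before trusting it.
    overlap_len = min(len(read1) - offset, len(read2_rc))
    mismatches = sum(a != b for a, b in zip(read1[offset:], read2_rc))
    if mismatches > max_mismatch_rate * overlap_len:
        return None

    merged = read1[:offset] + read2_rc
    # If read2_rc lies entirely inside read1, read1 is already the full fragment.
    return merged if len(merged) >= len(read1) else read1
```

Calling `merge_pair(read1, read2)` on a pair whose reads overlap by a few dozen bases would be expected to return the reconstructed fragment, while a pair sharing no k-mer yields `None`. Voting over offsets avoids aligning every possible overlap length, which is presumably where the speed of such a strategy comes from.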
Original language: English
Article number: bbad080
Number of pages: 10
Journal: Briefings in Bioinformatics
Volume: 24
Issue number: 2
Online published: 9 Mar 2023
Publication status: Published - Mar 2023

Bibliographical note

Information for this record is supplemented by the author(s) concerned.

Funding

CityU/UGC Research Matching Grant Scheme (9229012 and 9229013).

Research Keywords

  • software
  • high-throughput sequencing
  • paired-end reads assembly
  • immune repertoire
  • MRD detection
