IMperm: a fast and comprehensive IMmune Paired-End Reads Merger for sequencing data

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Author(s)

  • Jia Ju
  • Chaohui Li
  • Shixin Lu
  • Zefeng Lu
  • Liya Lin
  • Xiao Liu

Detail(s)

Original language: English
Article number: bbad080
Number of pages: 10
Journal / Publication: Briefings in Bioinformatics
Volume: 24
Issue number: 2
Online published: 9 Mar 2023
Publication status: Published - Mar 2023

Abstract

The adaptive immune receptor repertoire (AIRR), consisting of T- and B-cell receptors, is the core component of the immune system. The AIRR sequencing is commonly used in cancer immunotherapy and minimal residual disease (MRD) detection of leukemia and lymphoma. The AIRR is captured by primers and sequenced to yield paired-end (PE) reads. The PE reads could be merged into one sequence by the overlapped region between them. However, the wide range of AIRR data raises the difficulty, so a special tool is required. We developed a software package for IMmune PE reads merger of sequencing data, named IMperm. We used the k-mer-and-vote strategy to pin down the overlapped region rapidly. IMperm could handle all types of PE reads, eliminate adapter contamination and successfully merge low-quality and minor/non-overlapping reads. Compared with existing tools, IMperm performed better in both simulated and sequencing data. Notably, IMperm was well suited to processing the data of MRD detection in leukemia and lymphoma and detected 19 novel MRD clones in 14 patients with leukemia from previously published data. Additionally, IMperm can handle PE reads from other sources, and we demonstrated its effectiveness on two genomic and one cell-free deoxyribonucleic acid datasets. IMperm is implemented in the C programming language and consumes little runtime and memory. It is freely available at https://github.com/zhangwei2015/IMperm. © The Author(s) 2023.
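
The abstract names a k-mer-and-vote strategy for locating the overlapped region but gives no further detail. The sketch below is only an illustration of that general idea and should not be read as IMperm's actual implementation: every function name, the k-mer size `k=8`, and the `min_votes` threshold are assumptions made for this example. It indexes the k-mers of read 1, lets each matching k-mer of the reverse-complemented read 2 vote for an alignment offset, and merges at the offset with the most votes. Base qualities, mismatches inside the overlap, and adapter read-through (negative offsets) are deliberately not handled.

```python
from collections import Counter, defaultdict

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def vote_offset(read1, read2_rc, k=8):
    """Vote for the offset of read2_rc relative to read1 using shared k-mers.

    Returns (offset, votes) for the most-voted offset, or None when the
    two reads share no k-mer. k=8 is an illustrative choice.
    """
    # Index every k-mer of read1 by its start position(s).
    index = defaultdict(list)
    for i in range(len(read1) - k + 1):
        index[read1[i:i + k]].append(i)

    # Each shared k-mer votes for the offset implied by its two positions.
    votes = Counter()
    for j in range(len(read2_rc) - k + 1):
        for i in index.get(read2_rc[j:j + k], ()):
            votes[i - j] += 1

    if not votes:
        return None
    return votes.most_common(1)[0]


def merge_pair(read1, read2, k=8, min_votes=2):
    """Merge a read pair at the most-voted offset, or return None.

    Hypothetical helper: overlap bases are taken from read1, and pairs
    with a negative offset (read-through) are simply rejected.
    """
    read2_rc = reverse_complement(read2)
    best = vote_offset(read1, read2_rc, k)
    if best is None or best[1] < min_votes:
        return None
    offset, _ = best
    if offset < 0:
        return None
    # Append only the part of read2_rc that extends past the end of read1.
    return read1 + read2_rc[len(read1) - offset:]


if __name__ == "__main__":
    # Made-up fragment, used only to exercise the sketch.
    fragment = "ATGGCGTACCTGAAGTCCGATTGACCTAGGCTTAACGGATCCGTAGCTAAGGTCACGTTA"
    r1 = fragment[:40]
    r2 = reverse_complement(fragment[20:])
    print(merge_pair(r1, r2) == fragment)
```

Voting on offsets avoids scoring every possible alignment: only positions that share an exact k-mer contribute, so the candidate overlap is found in roughly linear time for reads with few repeated k-mers.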

Research Area(s)

  • software, high-throughput sequencing, paired-end reads assembly, immune repertoire, MRD detection

Bibliographic Note

Information for this record is supplemented by the author(s) concerned.