Exploring Scalable Parallelization for Edit Distance-Based Motif Search
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Author(s)
Related Research Unit(s)
Detail(s)
Original language | English |
---|---|
Pages (from-to) | 1587-1593 |
Journal / Publication | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
Volume | 20 |
Issue number | 2 |
Online published | 23 Sept 2022 |
Publication status | Published - Mar 2023 |
Link(s)
Abstract
Motif Searching is an important problem that can reveal crucial information from biological data. Since the general motif searching is NP-hard and the volume of biological data is growing exponentially in recent years, there is a pressing need for developing time and space-efficient algorithms to find motifs. In this paper, we explore scalable parallelization for Edit Distance-Based Motif Search (EMS). We introduce two parallel designs, recursEMS which integrates the existing EMS solver into a parallel recursion tree running in multiple processes, and parEMS that presents a novel thread-based method which avoids the storage of redundant motif candidates. To make the parallel designs practical, we implement SPEMS, a Scalability-sensitive Parallel solver for EMS. For any given biological dataset and search instance, SPEMS can provide an EMS parallelization towards the optimal performance, or a sub-optimal performance but being more space efficient. Evaluations on two real-world DNA dataset TRANSFAC and ChIP-seq show that SPEMS can obtain 10× geometric mean speedup over the state-of-the-art at the expense of no less than 74.7% memory overheads, or provide 2.2× geometric mean speedup with the possibility of consuming less memory, when running on a 48-core machine. © 2022 IEEE.
Research Area(s)
- Biology, Costs, DNA, Edit-distance motif search, Parallel algorithms, parallelism, Proteins, Search problems, Time complexity
Citation Format(s)
Exploring Scalable Parallelization for Edit Distance-Based Motif Search. / Qiu, Junqiao; Ebnenasir, Ali.
In: IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 20, No. 2, 03.2023, p. 1587-1593.
In: IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 20, No. 2, 03.2023, p. 1587-1593.
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review