TY - GEN
T1 - Parallelizing Big De Bruijn Graph Traversal for Genome Assembly on GPU Clusters
AU - Qiu, Shuang
AU - Feng, Zonghao
AU - Luo, Qiong
PY - 2019/4
Y1 - 2019/4
N2 - De Bruijn graph traversal is a critical step in de novo assemblers. It uses the graph structure to analyze genome sequences and is both memory-intensive and time-consuming. To improve efficiency, we develop ParaGraph, which parallelizes De Bruijn graph traversal on a cluster of GPU-equipped computer nodes. With effective vertex partitioning and fine-grained parallel algorithms, ParaGraph utilizes all cores of each CPU and GPU, all CPUs and GPUs in a computer node, and all computer nodes of a cluster. Our results show that ParaGraph is able to traverse billion-node graphs within three minutes on a cluster of six GPU-equipped computer nodes. It is an order of magnitude faster than the state-of-the-art shared-memory assemblers, and more than five times faster than the current distributed assemblers. © 2019, Springer Nature Switzerland AG.
UR - https://www.scopus.com/pages/publications/85065405183
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85065405183&origin=recordpage
U2 - 10.1007/978-3-030-18590-9_68
DO - 10.1007/978-3-030-18590-9_68
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 9783030185893
VL - 11448 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 466
EP - 470
BT - Database Systems for Advanced Applications - DASFAA 2019 International Workshops: BDMS, BDQM, and GDMA, Proceedings
PB - Springer Verlag
T2 - 24th International Conference on Database Systems for Advanced Applications, DASFAA 2019
Y2 - 22 April 2019 through 25 April 2019
ER -