TY - JOUR
T1 - SuperTAD-Fast
T2 - Accelerating Topologically Associating Domains Detection Through Discretization
AU - LING, Zhao
AU - ZHANG, Yu Wei
AU - LI, Shuai Cheng
PY - 2024/9
Y1 - 2024/9
N2 - High-throughput chromosome conformation capture (Hi-C) technology captures spatial interactions of DNA sequences as matrices, and software tools have been developed to identify topologically associating domains (TADs) from these Hi-C matrices. Using structural information theory, SuperTAD adopted a dynamic programming approach to find the TAD hierarchy with minimal structural entropy. However, the algorithm suffers from high time complexity. To accelerate it, we designed an approximation algorithm with a theoretical performance guarantee and implemented it in a package, SuperTAD-Fast. Using Hi-C matrices and simulated data, we demonstrated that SuperTAD-Fast achieves a substantial runtime improvement over SuperTAD. On Hi-C data from human cell lines, SuperTAD-Fast shows high consistency and significant enrichment of structural proteins in comparison with six existing hierarchical TAD detection methods. © Mary Ann Liebert, Inc.
AB - High-throughput chromosome conformation capture (Hi-C) technology captures spatial interactions of DNA sequences as matrices, and software tools have been developed to identify topologically associating domains (TADs) from these Hi-C matrices. Using structural information theory, SuperTAD adopted a dynamic programming approach to find the TAD hierarchy with minimal structural entropy. However, the algorithm suffers from high time complexity. To accelerate it, we designed an approximation algorithm with a theoretical performance guarantee and implemented it in a package, SuperTAD-Fast. Using Hi-C matrices and simulated data, we demonstrated that SuperTAD-Fast achieves a substantial runtime improvement over SuperTAD. On Hi-C data from human cell lines, SuperTAD-Fast shows high consistency and significant enrichment of structural proteins in comparison with six existing hierarchical TAD detection methods. © Mary Ann Liebert, Inc.
KW - discretization
KW - dynamic programming
KW - Hi-C
KW - structural information theory
KW - topologically associating domains
UR - http://www.scopus.com/inward/record.url?scp=85203855320&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85203855320&origin=recordpage
U2 - 10.1089/cmb.2024.0490
DO - 10.1089/cmb.2024.0490
M3 - RGC 21 - Publication in refereed journal
C2 - 39047029
SN - 1066-5277
VL - 31
SP - 784
EP - 796
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 9
ER -