SuperTAD-Fast: Accelerating Topologically Associating Domains Detection Through Discretization

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Detail(s)

Original language: English
Pages (from-to): 784-796
Journal / Publication: Journal of Computational Biology
Volume: 31
Issue number: 9
Online published: 4 Sept 2024
Publication status: Published - Sept 2024

Abstract

High-throughput chromosome conformation capture (Hi-C) technology captures spatial interactions between DNA sequences as matrices, and software tools have been developed to identify topologically associating domains (TADs) from these Hi-C matrices. Building on structural information theory, SuperTAD adopted a dynamic programming approach to find the TAD hierarchy with minimal structural entropy. However, the algorithm suffers from high time complexity. To accelerate it, we design an approximation algorithm with a theoretical performance guarantee and implement it as a package, SuperTAD-Fast. Using Hi-C matrices and simulated data, we demonstrate that SuperTAD-Fast achieves a substantial runtime improvement over SuperTAD. On Hi-C data from human cell lines, SuperTAD-Fast shows high consistency and significant enrichment of structural proteins in comparison with six existing hierarchical TAD detection methods. © Mary Ann Liebert, Inc.
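
For readers unfamiliar with the objective mentioned in the abstract, the following is a minimal sketch of the two-level structural entropy of a contiguous partition of a contact matrix, as defined in structural information theory. It is not the SuperTAD or SuperTAD-Fast implementation and does not include the dynamic programming or the discretization step. The function name `structural_entropy`, the `boundaries` argument, and the toy matrix are illustrative assumptions, and the matrix is assumed to be symmetric with a zero diagonal.

```python
import math


def structural_entropy(matrix, boundaries):
    """Two-level structural entropy of a contiguous partition (illustrative).

    matrix: symmetric list of lists of non-negative contact weights,
            assumed to have a zero diagonal.
    boundaries: sorted start indices of the domains; the first must be 0.
    """
    n = len(matrix)
    degrees = [sum(row) for row in matrix]
    total = sum(degrees)  # total volume, i.e. twice the total edge weight
    ends = list(boundaries[1:]) + [n]
    entropy = 0.0
    for start, end in zip(boundaries, ends):
        volume = sum(degrees[start:end])
        if volume == 0:
            continue
        # Weight inside the domain, counted once per direction.
        inside = sum(
            matrix[i][j] for i in range(start, end) for j in range(start, end)
        )
        cut = volume - inside  # weight of edges leaving the domain
        # Cost of locating each bin within its domain.
        for i in range(start, end):
            if degrees[i] > 0:
                entropy -= degrees[i] / total * math.log2(degrees[i] / volume)
        # Cost of locating the domain within the whole matrix.
        if cut > 0:
            entropy -= cut / total * math.log2(volume / total)
    return entropy


# Toy matrix (hypothetical data): two pairs of bins that only contact each other.
toy = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
print(structural_entropy(toy, [0, 2]))  # partition matching the two pairs
print(structural_entropy(toy, [0]))     # a single domain covering all bins
```

On this toy input, the partition that matches the two pairs scores lower than either the single-domain partition or a misplaced boundary, which is the sense in which a minimal-entropy partition is preferred.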

Research Area(s)

  • discretization, dynamic programming, Hi-C, structural information theory, topologically associating domains