Multiobjective Genome-Wide RNA-Binding Event Identification From CLIP-Seq Data
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Author(s)
- Li, Xiangtao
- Zhang, Shixiong
- Wong, Ka-Chun
Detail(s)
Original language | English |
---|---|
Pages (from-to) | 5811-5824 |
Journal / Publication | IEEE Transactions on Cybernetics |
Volume | 51 |
Issue number | 12 |
Online published | 10 Jan 2020 |
Publication status | Published - Dec 2021 |
Abstract
RNA-binding proteins (RBPs) are master regulators of mRNA processing and vital players in the post-transcriptional control of gene expression. In recent years, crosslinking immunoprecipitation sequencing (CLIP-seq) technologies have enabled the sequencing of massive amounts of genome-wide RNA-binding event data. The increasing availability of these data provides opportunities to identify protein-RNA interactions on a genome-wide scale. Genome-wide RNA-binding event detection methods have been developed to advance the understanding of the proteins' functions within cellular processes. Unfortunately, those methods often suffer from practical restrictions, such as high costs, intensive computation, high dimensionality, numerical instability, and data sparsity. We present a computational method, the multiobjective forest algorithm (MFA), to identify protein-RNA interactions from CLIP-seq data by synergizing multiobjective biogeography-based optimization (BBO) with random forest (RF). Since most of the tree-structured classifiers in RF are unnecessarily bulky, with extra time costs and memory consumption, multiobjective BBO is designed to prune the unsuitable tree-structured classifiers dynamically. Moreover, to direct the evolution dynamics of the MFA, two objective functions are formulated to balance model generality and complexity for robust performance. To validate our MFA method, we evaluate its performance across 31 large-scale CLIP-seq datasets. The experimental results demonstrate that MFA can obtain superior performance over the current state-of-the-art methods. Mechanistic insights are also revealed and discussed to explore the multifaceted aspects of MFA through data source importance analysis, matrix rank estimations, seeding component perturbations, and multiobjective optimization methodology comparisons.
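The idea of pruning a forest under two competing objectives can be illustrated with a minimal sketch. This is not the authors' MFA: the abstract does not specify the two objective functions or the BBO operators, so the sketch assumes the objectives are validation error (generality) and the fraction of trees kept (complexity), and it replaces the BBO search with exhaustive enumeration over a toy ensemble. All names (`majority_vote_error`, `objectives`, `dominates`, `pareto_front`) and the toy data are hypothetical.

```python
from itertools import product


def majority_vote_error(mask, tree_preds, labels):
    """Error rate of the majority vote over the trees selected by mask."""
    selected = [p for keep, p in zip(mask, tree_preds) if keep]
    if not selected:
        return 1.0  # assumption: an empty ensemble counts as always wrong
    errors = 0
    for i, y in enumerate(labels):
        votes = sum(p[i] for p in selected)
        pred = 1 if 2 * votes > len(selected) else 0  # ties go to class 0
        errors += pred != y
    return errors / len(labels)


def objectives(mask, tree_preds, labels):
    """Assumed objectives, both minimized: (validation error, fraction of trees kept)."""
    return (majority_vote_error(mask, tree_preds, labels), sum(mask) / len(mask))


def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates, tree_preds, labels):
    """Keep the candidate masks that no other candidate dominates."""
    scored = [(m, objectives(m, tree_preds, labels)) for m in candidates]
    return [(m, f) for m, f in scored if not any(dominates(g, f) for _, g in scored)]


# Toy validation set: binary labels and the per-sample predictions of four trees.
labels = [1, 0, 1, 1, 0, 0]
tree_preds = [
    [1, 0, 1, 0, 0, 0],  # one mistake
    [1, 0, 0, 1, 0, 0],  # one mistake
    [0, 0, 1, 1, 0, 0],  # one mistake
    [0, 1, 0, 0, 1, 1],  # wrong on every sample
]

# Enumerate every non-empty subset of trees (stand-in for the BBO search).
candidates = [m for m in product([0, 1], repeat=len(tree_preds)) if any(m)]
front = pareto_front(candidates, tree_preds, labels)

for mask, (err, size) in front:
    print(mask, round(err, 3), size)
```

On this toy data the front contains the three single-tree ensembles (smaller, slightly less accurate) and the three-tree ensemble that drops the uninformative tree (larger, error-free on the validation samples). An evolutionary search such as BBO becomes relevant once the forest is too large to enumerate.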
Research Area(s)
- Crosslinking immunoprecipitation sequencing (CLIP-seq) data, multiobjective optimization, RNA-binding proteins (RBPs)
Citation Format(s)
Multiobjective Genome-Wide RNA-Binding Event Identification From CLIP-Seq Data. / Li, Xiangtao; Zhang, Shixiong; Wong, Ka-Chun.
In: IEEE Transactions on Cybernetics, Vol. 51, No. 12, 12.2021, p. 5811-5824.