Improving residue-residue contact prediction via low-rank and sparse decomposition of residue correlation matrix

Research output: Journal Publications and Reviews (RGC: 21, 22, 62); 21_Publication in refereed journal; peer-review

18 Scopus Citations

Author(s)

  • Haicang Zhang
  • Yujuan Gao
  • Minghua Deng
  • Chao Wang
  • Jianwei Zhu
  • Shuai Cheng Li
  • Wei-Mou Zheng
  • Dongbo Bu

Detail(s)

Original language: English
Pages (from-to): 217-222
Journal / Publication: Biochemical and Biophysical Research Communications
Volume: 472
Issue number: 1
Publication status: Published - 25 Mar 2016

Abstract

Strategies for correlation analysis in protein contact prediction often encounter two challenges: indirect coupling among residues, and background correlations caused mainly by phylogenetic biases. While various studies have addressed how to disentangle indirect coupling, the removal of background correlations remains unresolved. Here, we present an approach for removing background correlations via low-rank and sparse decomposition (LRS) of a residue correlation matrix. The correlation matrix can be constructed using either local inference strategies (e.g., mutual information, or MI) or global inference strategies (e.g., direct coupling analysis, or DCA). In our approach, a correlation matrix is decomposed into two components: a low-rank component representing background correlations, and a sparse component representing true correlations. Finally, the residue contacts are inferred from the sparse component of the correlation matrix. We trained our LRS-based method on the PSICOV dataset, and tested it on both the GREMLIN and CASP11 datasets. Our experimental results suggest that LRS significantly improves contact prediction precision. For example, when equipped with the LRS technique, the prediction precision of MI and mfDCA increased from 0.25 to 0.67 and from 0.58 to 0.70, respectively (top L/10 predicted contacts, sequence separation: 5 AA, dataset: GREMLIN). In addition, our LRS technique consistently outperforms the popular denoising technique APC (average product correction) on both local (MI-LRS: 0.67 vs MI-APC: 0.34) and global measures (mfDCA-LRS: 0.70 vs mfDCA-APC: 0.67). Interestingly, we found that when equipped with our LRS technique, local inference strategies performed comparably to global inference strategies, implying that LRS narrows the performance gap between the two. Overall, our LRS technique greatly facilitates protein contact prediction by removing background correlations. An implementation of the approach called COLORS (improving COntact prediction using LOw-Rank and Sparse matrix decomposition) is available from http://protein.ict.ac.cn/COLORS/.
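
The decomposition described in the abstract can be illustrated with a small sketch. The abstract does not state which optimization the authors solve, so the code below substitutes a generic technique: principal component pursuit (robust PCA), solved with a fixed-step augmented Lagrangian iteration. It is not the COLORS implementation. The function names, the default weight `lam`, the step size `mu`, the reading of "sequence separation: 5 AA" as j - i >= 5, and the synthetic matrix are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(x, tau):
    """Elementwise shrinkage: proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svd_threshold(x, tau):
    """Singular value shrinkage: proximal operator of the nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt

def low_rank_sparse(c, lam=None, tol=1e-7, max_iter=500):
    """Split c into low_rank + sparse by principal component pursuit.

    Generic robust PCA stand-in, not the paper's solver. Assumes c is a
    non-zero real matrix.
    """
    c = np.asarray(c, dtype=float)
    n_rows, n_cols = c.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n_rows, n_cols))   # common default weight
    mu = n_rows * n_cols / (4.0 * np.abs(c).sum())  # common step heuristic
    low_rank = np.zeros_like(c)
    sparse = np.zeros_like(c)
    dual = np.zeros_like(c)
    norm_c = np.linalg.norm(c)
    for _ in range(max_iter):
        low_rank = svd_threshold(c - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(c - low_rank + dual / mu, lam / mu)
        residual = c - low_rank - sparse
        dual += mu * residual
        if np.linalg.norm(residual) <= tol * norm_c:
            break
    return low_rank, sparse

def top_contacts(sparse, k, min_sep=5):
    """Return the k pairs (i, j) with j - i >= min_sep and largest score."""
    score = (sparse + sparse.T) / 2.0
    i, j = np.triu_indices(score.shape[0], k=min_sep)
    order = np.argsort(score[i, j])[::-1][:k]
    return list(zip(i[order].tolist(), j[order].tolist()))

# Synthetic data, invented for illustration only: a rank-1 "background"
# term plus a few strong pairwise signals standing in for true contacts.
rng = np.random.default_rng(0)
background = rng.uniform(0.5, 1.5, size=60)
c = np.outer(background, background)
true_pairs = [(2, 20), (7, 45), (11, 30), (25, 52), (33, 58)]
for a, b in true_pairs:
    c[a, b] += 3.0
    c[b, a] += 3.0

low_rank, sparse = low_rank_sparse(c)
contacts = top_contacts(sparse, k=len(true_pairs))
```

In this toy setting the background is exactly rank 1 and the signals are few and well separated, which is the regime where this kind of decomposition is expected to work well. Real MI or DCA matrices are noisier, and the weight between the low-rank and sparse terms would need tuning on training data, as the abstract indicates the authors did on PSICOV.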

Research Area(s)

  • Background correlation removal
  • Correlation analysis
  • Low-rank and sparse matrix decomposition
  • Protein contacts prediction

Citation Format(s)

Improving residue-residue contact prediction via low-rank and sparse decomposition of residue correlation matrix. / Zhang, Haicang; Gao, Yujuan; Deng, Minghua; Wang, Chao; Zhu, Jianwei; Li, Shuai Cheng; Zheng, Wei-Mou; Bu, Dongbo.

In: Biochemical and Biophysical Research Communications, Vol. 472, No. 1, 25.03.2016, p. 217-222.
