Using propensity scores to predict the kinases of unannotated phosphopeptides

Research output: Journal Publications and Reviews (RGC: 21, 22, 62); 21_Publication in refereed journal; peer-review

6 Scopus Citations
Author(s)

  • Qingfeng Chen
  • Yiqi Wang
  • Baoshan Chen
  • Chengqi Zhang
  • Lusheng Wang
  • Jinyan Li

Detail(s)

Original language: English
Pages (from-to): 60-76
Journal / Publication: Knowledge-Based Systems
Volume: 135
Online published: 14 Aug 2017
Publication status: Published - 1 Nov 2017

Abstract

Protein phosphorylation is the process of binding a protein kinase to a specific site in a protein substrate for post-translational modification. Thousands of distinct phosphorylation sites have been identified, but most of them are not annotated with any kinase information. This work proposes a novel kinase-subgrouping propensity method (kiSP) to predict the binding kinases for phosphopeptides. Existing methods do not distinguish the residue conservation properties of the kinase family subgroups for annotation. Our method exploits maximum entropy variance to prune non-conserved sites from the subset of phosphopeptides that bind to the same kinase family. We also use maximal mutual information to estimate an appropriate upstream-downstream window size for this subset. A propensity score for every kinase family is calculated from its positive and negative data, which indicates its effectiveness as a site for each test phosphopeptide. Experimental results demonstrate that our method outperforms current algorithms in specificity and sensitivity under cross-validation. kiSP is also demonstrated to correctly predict kinase families for phosphopeptides with unknown kinase information.
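
As a rough illustration of the ideas named in the abstract, the following Python sketch scores aligned phosphopeptides against one kinase family. It is not the authors' kiSP implementation: it assumes a plain Shannon-entropy cutoff in place of the maximum entropy variance pruning, assumes a log-odds ratio with pseudocounts as the propensity score, and leaves out the mutual-information window-size selection entirely. The peptides, the cutoff of 1.0 bit, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: aligned 7-mers with the phosphosite at index 3.
positives = ["RRASLPE", "KRKSVPD", "RRGSFPA"]   # assumed to bind the family
negatives = ["AEDSGLK", "LGESDVT", "PDTSEAQ"]   # assumed not to bind it


def column_entropy(peptides, pos):
    """Shannon entropy (bits) of the residues at one aligned position."""
    counts = Counter(p[pos] for p in peptides)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conserved_positions(peptides, max_entropy):
    """Keep positions whose entropy is at or below an assumed cutoff."""
    return [i for i in range(len(peptides[0]))
            if column_entropy(peptides, i) <= max_entropy]


def propensity_table(pos_set, neg_set, positions, pseudocount=1.0, alphabet=20):
    """Log2 odds of each residue at each kept position, positive vs negative."""
    table = {}
    for i in positions:
        pos_counts = Counter(p[i] for p in pos_set)
        neg_counts = Counter(p[i] for p in neg_set)
        for r in set(pos_counts) | set(neg_counts):
            f_pos = (pos_counts[r] + pseudocount) / (len(pos_set) + pseudocount * alphabet)
            f_neg = (neg_counts[r] + pseudocount) / (len(neg_set) + pseudocount * alphabet)
            table[(i, r)] = math.log2(f_pos / f_neg)
    return table


def score(peptide, table, positions):
    """Sum the log-odds over kept positions; unseen residues contribute 0."""
    return sum(table.get((i, peptide[i]), 0.0) for i in positions)


kept = conserved_positions(positives, max_entropy=1.0)
table = propensity_table(positives, negatives, kept)
```

In this sketch, a test phosphopeptide would be assigned to the family with the highest score after building one table per family. A positive score means the peptide's residues at the kept positions are more typical of the positive set than of the negative set.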

Research Area(s)

  • Classification, Entropy, Phosphorylation site, Protein kinases, Variance

Citation Format(s)

Using propensity scores to predict the kinases of unannotated phosphopeptides. / Chen, Qingfeng; Wang, Yiqi; Chen, Baoshan; Zhang, Chengqi; Wang, Lusheng; Li, Jinyan.

In: Knowledge-Based Systems, Vol. 135, 01.11.2017, p. 60-76.
