Identifying protein-kinase-specific phosphorylation sites based on the BaggingAdaBoost ensemble approach

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

22 Scopus Citations

Original language: English
Article number: 5427103
Pages (from-to): 132-143
Journal / Publication: IEEE Transactions on Nanobioscience
Issue number: 2
Publication status: Published - Jun 2010


Protein phosphorylation is an important step in many biological processes, such as the cell cycle, membrane transport, and apoptosis. To obtain more useful information about protein phosphorylation, it is necessary to develop a robust, stable, and accurate approach to predicting phosphorylation sites. Although a number of approaches to predicting phosphorylation sites exist, such as those based on neural networks and support vector machines, they use only a single classifier. In general, the prediction results obtained by these approaches are not very stable or robust. In this paper, we design a new classifier ensemble approach called the BaggingAdaBoost ensemble (BAE) for the prediction of eukaryotic protein phosphorylation sites, which incorporates the bagging technique and the AdaBoost technique into the classifier framework to improve the accuracy, stability, and robustness of the final result. To our knowledge, this is the first time that a combined bagging and boosting ensemble approach has been applied to predict phosphorylation sites. Our prediction system based on BAE focuses on six kinase families: CDK, CK2, MAPK, PKA, PKC, and SRC. BAE achieves good performance on these six families, with prediction accuracies of 0.8634, 0.8721, 0.8542, 0.8537, 0.8052, and 0.7432, respectively. © 2006 IEEE.
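
The abstract does not state how bagging and AdaBoost are nested, which base classifier is used, or how sequence windows around candidate sites are encoded. Purely as an illustration of the general idea, the sketch below assumes an outer bagging loop (bootstrap resampling plus majority vote) around AdaBoost models built from decision stumps, trained on toy numeric features. All names (`train_bae`, `bae_predict`, and so on), the toy data, and the parameter values are hypothetical and not taken from the paper.

```python
import math
import random

def train_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[j] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, row):
    _, j, thr, pol = stump
    return pol if row[j] >= thr else -pol

def train_adaboost(X, y, rounds=10):
    """Discrete AdaBoost over decision stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        if stump[0] < 1e-10:  # a perfect stump: further rounds add nothing
            break
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

def train_bae(X, y, n_bags=11, rounds=10, seed=0):
    """Assumed nesting: one AdaBoost model per bootstrap sample."""
    rng = random.Random(seed)
    n = len(X)
    bags = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bags.append(train_adaboost([X[i] for i in idx],
                                   [y[i] for i in idx], rounds))
    return bags

def bae_predict(bags, row):
    """Majority vote over the bagged AdaBoost models (n_bags is odd)."""
    votes = sum(adaboost_predict(model, row) for model in bags)
    return 1 if votes >= 0 else -1

# Toy data: +1 stands in for "phosphorylation site", -1 for "non-site".
X = [[0.1, 1.0], [0.2, 0.8], [0.3, 0.9], [0.4, 0.7],
     [0.6, 0.2], [0.7, 0.3], [0.8, 0.1], [0.9, 0.4]]
y = [-1, -1, -1, -1, 1, 1, 1, 1]

ensemble = train_bae(X, y)
print([bae_predict(ensemble, row) for row in X])
```

In a setting like the one the abstract describes, one such ensemble would presumably be trained per kinase family (CDK, CK2, MAPK, PKA, PKC, SRC), with real sequence-derived features in place of the toy vectors. The vote over bootstrap-trained models is what targets stability, while the boosting inside each bag targets accuracy.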

Research Area(s)

  • Adaptive boosting (AdaBoost), Bagging, Ensemble, Kinase family, Phosphorylation sites, Prediction