Active learning for the prediction of phosphorylation sites

Jun Jiang, Horace H. S. Ip

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

4 Citations (Scopus)

Abstract

In this paper, we propose several active learning strategies to train classifiers for phosphorylation site prediction. We show that, when combined with a support vector machine (SVM), active learning is able to produce classifiers that give comparable or better phosphorylation site prediction performance than conventional SVM techniques while requiring significantly fewer annotated protein training samples. The result has both conceptual and practical implications for protein prediction: it exploits information inherent in large-scale databases of non-annotated protein samples and reduces the amount of manual labor required for protein annotation. To the best of our knowledge, active learning has not previously been explored for phosphorylation site prediction. Several active learning strategies, namely single-running mode and batch-running mode with sample and support vector diversity, were investigated for phosphorylation site prediction in this work. Our experiments have shown that active learning with SVM is able to reduce the effort of protein annotation by 6.6% to 25.7% while yielding prediction performance similar to that of the conventional SVM technique. © 2008 IEEE.
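The abstract gives no implementation details, but the general pattern it describes, margin-based (uncertainty) querying with an SVM in single-running mode, can be sketched as below. This is a minimal illustrative sketch, not the authors' code: the synthetic data, scikit-learn SVC, RBF kernel, and annotation budget are all assumptions standing in for the paper's phosphorylation-site features and experimental setup.

# A minimal sketch of margin-based (uncertainty) active learning with an SVM,
# illustrating the "single-running mode" strategy described in the abstract:
# at each round the unlabeled sample closest to the current decision boundary
# is queried for annotation. The synthetic data, scikit-learn SVC, RBF kernel,
# and annotation budget below are placeholders, not the paper's setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for phosphorylation-site feature vectors and labels.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Seed the labelled pool with a few samples from each class so the first
# SVM fit is well defined; the remainder forms the unlabeled pool.
labeled = []
for cls in np.unique(y):
    labeled.extend(rng.choice(np.where(y == cls)[0], size=10, replace=False).tolist())
unlabeled = [i for i in range(len(X)) if i not in set(labeled)]

n_queries = 50  # annotation budget
for _ in range(n_queries):
    clf = SVC(kernel="rbf", C=1.0).fit(X[labeled], y[labeled])

    # Margin-based sampling: query the unlabeled sample whose decision value
    # is closest to zero, i.e. closest to the SVM hyperplane.
    scores = np.abs(clf.decision_function(X[unlabeled]))
    pick = unlabeled[int(np.argmin(scores))]

    # Simulate the annotation step by revealing the true label.
    labeled.append(pick)
    unlabeled.remove(pick)

final_clf = SVC(kernel="rbf", C=1.0).fit(X[labeled], y[labeled])
print("Annotated training samples:", len(labeled))
print("Accuracy on the full pool:", final_clf.score(X, y))

A batch-running variant, as mentioned in the abstract, would select several samples per round and add a diversity criterion among the candidates so that the queried batch is not redundant.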
Original language: English
Title of host publication: Proceedings of the International Joint Conference on Neural Networks
Pages: 3158-3165
DOIs
Publication status: Published - 2008
Event: 2008 International Joint Conference on Neural Networks, IJCNN 2008 - Hong Kong, China
Duration: 1 Jun 2008 – 8 Jun 2008

Conference

Conference: 2008 International Joint Conference on Neural Networks, IJCNN 2008
Country/Territory: China
City: Hong Kong
Period: 1/06/08 – 8/06/08
