Abstract
In this paper, we propose several active learning strategies to train classifiers for phosphorylation site prediction. We show that active learning combined with a support vector machine (SVM) produces classifiers whose phosphorylation site prediction performance is comparable to or better than that of conventional SVM techniques, while requiring significantly fewer annotated protein training samples. The result has both conceptual and practical implications for protein prediction: it exploits information inherent in large-scale databases of non-annotated protein samples and reduces the manual labor required for protein annotation. To the best of our knowledge, active learning has not previously been explored for phosphorylation site prediction. We investigated several active learning strategies: single-running mode and batch-running mode with sample and support vector diversity. Our experiments show that active learning with SVM reduces the protein annotation effort by 6.6% to 25.7% while yielding prediction performance similar to that of the conventional SVM technique. © 2008 IEEE.
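The abstract does not spell out the query criterion or the SVM configuration, so the following is only a minimal sketch of the general idea behind a single-sample ("single-running") active learning loop, not the authors' method. It assumes margin-based uncertainty sampling (query the unlabeled sample closest to the current decision boundary), a bias-free linear SVM trained by subgradient descent, and a synthetic 2-D pool standing in for protein feature vectors. All names (`train_linear_svm`, `active_learn`, `oracle`) are hypothetical.

```python
"""Pool-based active learning with a linear SVM (illustrative sketch only)."""


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a bias-free linear SVM by full-batch subgradient descent on the
    L2-regularised hinge loss. Labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        # Start from the gradient of the regulariser (lam / 2) * ||w||^2.
        grad = [lam * wi for wi in w]
        for xi, yi in zip(X, y):
            if yi * dot(w, xi) < 1.0:  # sample lies inside or beyond the margin
                for j, xij in enumerate(xi):
                    grad[j] -= yi * xij / len(X)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def active_learn(X_pool, oracle, seed_idx, n_queries):
    """Single-sample mode: each round, request the label of the unlabeled
    sample nearest the decision boundary, then retrain on all labels so far."""
    labeled = list(seed_idx)
    labels = {i: oracle(i) for i in labeled}

    def fit():
        return train_linear_svm([X_pool[i] for i in labeled],
                                [labels[i] for i in labeled])

    w = fit()
    for _ in range(n_queries):
        unlabeled = [i for i in range(len(X_pool)) if i not in labels]
        if not unlabeled:
            break
        # Smallest |w . x| means the least confident prediction (assumed criterion).
        query = min(unlabeled, key=lambda i: abs(dot(w, X_pool[i])))
        labels[query] = oracle(query)  # simulated annotator
        labeled.append(query)
        w = fit()
    return w, labeled


# Synthetic pool (an assumption, not real protein data): 25 points with both
# coordinates >= 1 as positives, and their mirror images as negatives.
pos = [(1 + 0.5 * (i % 5), 1 + 0.5 * (i // 5)) for i in range(25)]
X_pool = pos + [(-a, -b) for a, b in pos]
y_true = [1] * 25 + [-1] * 25

# Seed with one labeled example per class, then make 8 label queries.
w, labeled = active_learn(X_pool, lambda i: y_true[i], seed_idx=[0, 25], n_queries=8)
predictions = [1 if dot(w, x) > 0 else -1 for x in X_pool]
accuracy = sum(p == t for p, t in zip(predictions, y_true)) / len(X_pool)
print(f"labels used: {len(labeled)} of {len(X_pool)}, accuracy: {accuracy:.2f}")
```

A batch-running variant would select several samples per round instead of one; how the paper enforces sample and support vector diversity within a batch is not described in the abstract, so it is not sketched here.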
Original language | English |
---|---|
Title of host publication | Proceedings of the International Joint Conference on Neural Networks |
Pages | 3158-3165 |
Publication status | Published - 2008 |
Event | 2008 International Joint Conference on Neural Networks, IJCNN 2008 - Hong Kong, China; Duration: 1 Jun 2008 → 8 Jun 2008 |
Conference
Conference | 2008 International Joint Conference on Neural Networks, IJCNN 2008 |
---|---|
Country/Territory | China |
City | Hong Kong |
Period | 1/06/08 → 8/06/08 |