TY - JOUR
T1 - Protein secondary structure prediction using NMR chemical shift data
AU - Zhao, Yuzhong
AU - Alipanahi, Babak
AU - Li, Shuai Cheng
AU - Li, Ming
PY - 2010/10
Y1 - 2010/10
N2 - Accurate determination of protein secondary structure from chemical shift information is a key step in NMR tertiary structure determination. Relatively little work has been done on this subject. There is a need for a systematic investigation of algorithms that are (a) robust on large datasets; (b) easily extensible to new (dynamic) databases; and (c) approaching the limit of accuracy. We introduce new approaches that use the k-nearest neighbor algorithm for the basic prediction and the BCJR algorithm to smooth the predictions and to combine the predictions from chemical shifts with those based on sequence information only. Our new system, SUCCES, improves on the accuracy of all existing methods on a large dataset of 805 proteins (86% Q3 accuracy, and 92.6% accuracy when the boundary residues are ignored), and it is easily extensible to any new dataset without requiring any new training. The software is publicly available at http://monod.uwaterloo.ca/nmr/succes. © 2010 Imperial College Press.
AB - Accurate determination of protein secondary structure from chemical shift information is a key step in NMR tertiary structure determination. Relatively little work has been done on this subject. There is a need for a systematic investigation of algorithms that are (a) robust on large datasets; (b) easily extensible to new (dynamic) databases; and (c) approaching the limit of accuracy. We introduce new approaches that use the k-nearest neighbor algorithm for the basic prediction and the BCJR algorithm to smooth the predictions and to combine the predictions from chemical shifts with those based on sequence information only. Our new system, SUCCES, improves on the accuracy of all existing methods on a large dataset of 805 proteins (86% Q3 accuracy, and 92.6% accuracy when the boundary residues are ignored), and it is easily extensible to any new dataset without requiring any new training. The software is publicly available at http://monod.uwaterloo.ca/nmr/succes. © 2010 Imperial College Press.
KW - chemical shift
KW - NMR
KW - Protein structure prediction
UR - http://www.scopus.com/inward/record.url?scp=77958544970&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-77958544970&origin=recordpage
U2 - 10.1142/S0219720010004987
DO - 10.1142/S0219720010004987
M3 - RGC 21 - Publication in refereed journal
C2 - 20981892
SN - 0219-7200
VL - 8
SP - 867
EP - 884
JO - Journal of Bioinformatics and Computational Biology
JF - Journal of Bioinformatics and Computational Biology
IS - 5
ER -