TY - GEN
T1 - Sequence based prediction of protein mutant stability and discrimination of thermophilic proteins
AU - Gromiha, M. Michael
AU - Huang, Liang Tsung
AU - Lai, Lien Fu
PY - 2008/12/5
Y1 - 2008/12/5
N2 - Prediction of protein stability upon amino acid substitution and discrimination of thermophilic proteins from mesophilic ones are important problems in designing stable proteins. We have developed a classification rule generator using the information about wild-type, mutant, three neighboring residues and experimentally observed stability data. Utilizing the rules, we have developed a method based on decision tree for discriminating the stabilizing and destabilizing mutants and predicting protein stability changes upon single point mutations, which showed an accuracy of 82% and a correlation of 0.70, respectively. In addition, we have systematically analyzed the characteristic features of amino acid residues in 3075 mesophilic and 1609 thermophilic proteins belonging to 9 and 15 families, respectively, and developed methods for discriminating them. The method based on neural network could discriminate them at the 5-fold cross-validation accuracy of 89% in a dataset of 4684 proteins and 91% in a test set of 707 proteins.
AB - Prediction of protein stability upon amino acid substitution and discrimination of thermophilic proteins from mesophilic ones are important problems in designing stable proteins. We have developed a classification rule generator using the information about wild-type, mutant, three neighboring residues and experimentally observed stability data. Utilizing the rules, we have developed a method based on decision tree for discriminating the stabilizing and destabilizing mutants and predicting protein stability changes upon single point mutations, which showed an accuracy of 82% and a correlation of 0.70, respectively. In addition, we have systematically analyzed the characteristic features of amino acid residues in 3075 mesophilic and 1609 thermophilic proteins belonging to 9 and 15 families, respectively, and developed methods for discriminating them. The method based on neural network could discriminate them at the 5-fold cross-validation accuracy of 89% in a dataset of 4684 proteins and 91% in a test set of 707 proteins.
UR - http://www.scopus.com/inward/record.url?scp=57049177027&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=57049177027&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-88436-1_1
DO - 10.1007/978-3-540-88436-1_1
M3 - Conference contribution
AN - SCOPUS:57049177027
SN - 3540884343
SN - 9783540884347
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 12
BT - 3rd IAPR International Conference on Pattern Recognition in Bioinformatics, PRIB 2008
T2 - 3rd IAPR International Conference on Pattern Recognition in Bioinformatics, PRIB 2008
Y2 - 15 October 2008 through 17 October 2008
ER -