Predicting protein folding rates using the concept of Chou's pseudo amino acid composition
Author(s) - Guo Jianxiu, Rao Nini, Liu Guangxiong, Yang Yong, Wang Gang
Publication year - 2011
Publication title - Journal of Computational Chemistry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.907
H-Index - 188
eISSN - 1096-987X
pISSN - 0192-8651
DOI - 10.1002/jcc.21740
Subject(s) - pseudo amino acid composition, folding (dsp implementation), jackknife resampling, sequence (biology), protein folding, amino acid, protein sequencing, composition (language), algorithm, matthews correlation coefficient, computer science, computational biology, correlation coefficient, biological system, peptide sequence, chemistry, mathematics, biochemistry, biology, statistics, artificial intelligence, machine learning, engineering, linguistics, philosophy, dipeptide, support vector machine, electrical engineering, gene, estimator
One of the most important challenges in computational and molecular biology is to understand the relationship between amino acid sequences and the folding rates of proteins. Recent work suggests that topological parameters, amino acid properties, chain length, and the composition index correlate well with protein folding rates; however, sequence order information has seldom been considered as a property for predicting them. In this study, amino acid sequence order was used to derive an effective method, based on an extended version of the pseudo‐amino acid composition, for predicting protein folding rates without any explicit structural information. The method was evaluated with the jackknife cross‐validation test on the largest dataset reported (99 proteins). It gave a good correlation between the predicted and experimental folding rates: the correlation coefficient is 0.81 (highly significant) and the standard error is 2.46. The reported algorithm performed better than several representative sequence‐based approaches on the same dataset. The results indicate that sequence order information is an important determinant of protein folding rates. ©2011 Wiley Periodicals, Inc. J Comput Chem 2011.
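The abstract does not spell out the extended pseudo-amino acid composition or the regression model, so the following is only a minimal sketch of the general idea, not the authors' method. It computes the classic (type 1) pseudo-amino acid composition from a single property and runs a generic jackknife (leave-one-out) loop. Everything named here is an assumption made for illustration: the Kyte-Doolittle hydropathy scale stands in for whatever properties the paper used, and `pse_aac`, `jackknife`, `pearson_r`, `lam`, and `w` are hypothetical names and defaults.

```python
import math

# Kyte-Doolittle hydropathy values. Assumption: used only for illustration;
# the property set of the paper's extended PseAAC is not given in the abstract.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(HYDROPATHY)


def _standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    mean = sum(scale.values()) / len(scale)
    std = math.sqrt(sum((v - mean) ** 2 for v in scale.values()) / len(scale))
    return {aa: (v - mean) / std for aa, v in scale.items()}


H = _standardize(HYDROPATHY)


def pse_aac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional type 1 PseAAC vector for one sequence.

    The first 20 components carry composition; the last lam components carry
    sequence-order correlation between residues k positions apart (k = 1..lam).
    """
    seq = seq.upper()
    if any(aa not in H for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    n = len(seq)
    if lam >= n:
        raise ValueError("lam must be smaller than the sequence length")
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]
    thetas = [
        sum((H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(n - k)) / (n - k)
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def jackknife(features, targets, fit, predict):
    """Leave-one-out: predict each sample from a model fitted on all the others."""
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, features[i]))
    return predictions


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

In use, each protein sequence would be converted with `pse_aac`, a regression model of the reader's choice would be supplied as the `fit` and `predict` callables, and `pearson_r` would compare the jackknife predictions with the experimental folding rates.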