How well can the accuracy of comparative protein structure models be predicted?

Eramian David; Eswar Narayanan; Shen MinYi; Sali Andrej




Publication year - 2008

Publication title - Protein Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.353

H-Index - 175

eISSN - 1469-896X

pISSN - 0961-8368

DOI - 10.1110/ps.036061.108

Subject(s) - similarity (geometry), set (abstract data type), regression, protein structure prediction, sequence (biology), computer science, correlation, support vector machine, function (biology), mean squared error, contrast (vision), statistics, standard deviation, mathematics, artificial intelligence, protein structure, biology, biochemistry, genetics, geometry, evolutionary biology, image (mathematics), programming language

Comparative structure models are available for two orders of magnitude more protein sequences than are experimentally determined structures. These models, however, suffer from two limitations that experimentally determined structures do not: they frequently contain significant errors, and their accuracy cannot be readily assessed. We have addressed the latter limitation by developing a protocol optimized specifically for predicting the Cα root-mean-squared deviation (RMSD) and native overlap (NO3.5Å) errors of a model in the absence of its native structure. In contrast to most traditional assessment scores, which merely predict that one model is more accurate than others, this approach quantifies the error in an absolute sense, thus helping to determine whether or not the model is suitable for its intended applications. The assessment relies on a model-specific scoring function constructed by a support vector machine. This regression optimizes the weights of up to nine features, including various sequence similarity measures and statistical potentials, extracted from a tailored training set of models unique to the model being assessed: if possible, we use similarly sized models with the same fold; otherwise, we use similarly sized models with the same secondary structure composition. This protocol predicts the RMSD and NO3.5Å errors for a diverse set of 580,317 comparative models of 6174 sequences with correlation coefficients (r) of 0.84 and 0.86, respectively, to the actual errors. This scoring function achieves the best correlation of the criteria tested: the 13 other assessment criteria achieved correlations ranging from 0.35 to 0.71.
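To make the two error measures and the reported correlation concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the model and native Cα coordinates are already paired residue by residue and superposed, that native overlap is the fraction of Cα atoms lying within 3.5 Å of their native positions, and that r is the Pearson correlation coefficient. The function names `ca_rmsd`, `native_overlap`, and `pearson_r` and the toy coordinates are hypothetical.

```python
import math


def ca_rmsd(model, native):
    """C-alpha RMSD (in Å) between paired, pre-superposed coordinates."""
    if len(model) != len(native) or not model:
        raise ValueError("need two non-empty, equal-length coordinate lists")
    squared = sum(math.dist(m, n) ** 2 for m, n in zip(model, native))
    return math.sqrt(squared / len(model))


def native_overlap(model, native, cutoff=3.5):
    """Fraction of C-alpha atoms within `cutoff` Å of the native position.

    Assumption: an atom exactly at the cutoff counts as overlapping.
    """
    if len(model) != len(native) or not model:
        raise ValueError("need two non-empty, equal-length coordinate lists")
    close = sum(1 for m, n in zip(model, native) if math.dist(m, n) <= cutoff)
    return close / len(model)


def pearson_r(xs, ys):
    """Pearson correlation between predicted and actual error values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy example: four C-alpha atoms displaced by 0, 1, 2, and 4 Å.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 2.0), (11.4, 4.0, 0.0)]

print(ca_rmsd(model, native))
print(native_overlap(model, native))
```

In the protocol described above, these quantities are what the support vector machine regression is trained to predict when no native structure is available. A helper like `pearson_r` would then compare the predicted errors with the actual ones across a benchmark set.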
