Evaluating the absolute quality of a single protein model using structural features and support vector machines

Author(s) - Wang Zheng, Tegge Allison N., Cheng Jianlin

Publication year - 2008

Publication title - Proteins: Structure, Function, and Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.699

H-Index - 191

eISSN - 1097-0134

pISSN - 0887-3585

DOI - 10.1002/prot.22275

Subject(s) - ranking (information retrieval) , support vector machine , computer science , correlation , artificial intelligence , regression , mean absolute error , quality (philosophy) , rank (graph theory) , sequence (biology) , machine learning , statistics , pattern recognition (psychology) , data mining , mathematics , mean squared error , chemistry , biochemistry , combinatorics , philosophy , geometry , epistemology

Knowing the quality of a protein structure model is important for its appropriate usage. We developed a model evaluation method to assess the absolute quality of a single protein model using only structural features with support vector machine regression. The method assigns an absolute quantitative score (i.e. GDT‐TS) to a model by comparing its secondary structure, relative solvent accessibility, contact map, and beta sheet structure with their counterparts predicted from its primary sequence. We trained and tested the method on the CASP6 dataset using cross‐validation. The correlation between predicted and true scores is 0.82. On the independent CASP7 dataset, the correlation averaged over 95 protein targets is 0.76; the average correlation for template‐based and ab initio targets is 0.82 and 0.50, respectively. Furthermore, the predicted absolute quality scores can be used to rank models effectively. The average difference (or loss) between the scores of the top‐ranked models and the best models is 5.70 on the CASP7 targets. This method performs favorably when compared with the other methods used on the same dataset. Moreover, the predicted absolute quality scores are comparable across models for different proteins. These features make the method a valuable tool for model quality assurance and ranking. Proteins 2009. © 2008 Wiley‐Liss, Inc.
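The two evaluation measures reported in the abstract, the per-target correlation between predicted and true scores and the ranking loss, can be illustrated with a minimal sketch. It is not the authors' code. It assumes the correlation is a Pearson correlation and that the loss for one target is the true GDT-TS of the best model minus the true GDT-TS of the model ranked first by predicted score; the abstract does not spell out either detail. The function names, target names, and scores below are hypothetical and exist only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranking_loss(predicted, true):
    """Assumed loss definition: true score of the best model minus the
    true score of the model with the highest predicted score."""
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top]


def evaluate(targets):
    """Average the per-target correlation and loss over all targets.

    `targets` maps a target name to (predicted_scores, true_scores).
    """
    correlations = [pearson(p, t) for p, t in targets.values()]
    losses = [ranking_loss(p, t) for p, t in targets.values()]
    n = len(targets)
    return sum(correlations) / n, sum(losses) / n


# Made-up GDT-TS values for two hypothetical targets, three models each.
toy_targets = {
    "target_A": ([60.0, 45.0, 30.0], [58.0, 62.0, 25.0]),
    "target_B": ([20.0, 35.0, 50.0], [22.0, 30.0, 55.0]),
}
avg_corr, avg_loss = evaluate(toy_targets)
print(f"average correlation: {avg_corr:.2f}, average loss: {avg_loss:.2f}")
```

Averaging per target, rather than pooling all models, mirrors how the abstract reports the CASP7 figures ("averaged over 95 protein targets"); pooling would instead test whether scores are comparable across different proteins.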
