What are the baselines for protein fold recognition?

Open Access

Author(s) - Liam J. McGuffin, Kevin Bryson, David T. Jones

Publication year - 2001

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/17.1.63

Subject(s) - false positive paradox, fold (higher order function), computer science, pattern recognition (psychology), benchmark (surveying), protein secondary structure, artificial intelligence, sequence (biology), set (abstract data type), algorithm, data mining, biology, biochemistry, genetics, geodesy, programming language, geography

MOTIVATION: What constitutes a baseline level of success for protein fold recognition methods? Fold recognition benchmarks are often presented without any thought to the results that might be expected from a purely random set of predictions, so an analysis of fold recognition baselines is long overdue. Given varying amounts of basic information about a protein, ranging from the length of the sequence to a knowledge of its secondary structure, to what extent can the fold be determined by intelligent guesswork? Can simple methods that make use of secondary structure information assign folds more accurately than purely random methods, and could these methods be used to construct viable hierarchical classifications?

EXPERIMENTS PERFORMED: A number of rapid automatic methods that score similarities between protein domains were devised and tested. These methods ranged from those that incorporated no secondary structure information, such as measuring absolute differences in sequence lengths, to more complex alignments of secondary structure elements. Each method was assessed for accuracy by comparison with the Class Architecture Topology Homology (CATH) classification. Methods were rated against a random baseline fold assignment method as a lower control and against FSSP as an upper control. Similarity trees were constructed to evaluate how accurately the best-performing methods produce a classification of structure.
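To make the simplest comparison above concrete, here is a minimal sketch of a length-difference scorer set against a random fold assignment. It is an illustration only, not the authors' implementation: the function names, the nearest-length assignment rule, and the toy domain library with fold labels such as "fold_A" are all assumptions introduced here, and the lengths are invented rather than taken from CATH.

```python
def length_difference_score(len_a, len_b):
    """Absolute difference in sequence length; smaller means more similar."""
    return abs(len_a - len_b)


def assign_by_length(query_len, library):
    """Assign the fold of the library domain whose length is closest to the query.

    `library` is a list of (fold_label, sequence_length) pairs (hypothetical data).
    Ties are broken by library order.
    """
    best = min(library, key=lambda entry: length_difference_score(query_len, entry[1]))
    return best[0]


def length_method_accuracy(queries, library):
    """Fraction of queries whose assigned fold matches the reference fold."""
    correct = sum(
        1 for true_fold, length in queries
        if assign_by_length(length, library) == true_fold
    )
    return correct / len(queries)


def expected_random_accuracy(queries, library):
    """Expected accuracy when each query is given the fold of a uniformly
    random library domain (the lower control)."""
    total = 0.0
    for true_fold, _ in queries:
        matches = sum(1 for fold, _ in library if fold == true_fold)
        total += matches / len(library)
    return total / len(queries)


# Toy data: invented fold labels and lengths, for illustration only.
library = [
    ("fold_A", 100), ("fold_A", 110),
    ("fold_B", 250), ("fold_B", 240),
    ("fold_C", 60),
]
queries = [("fold_A", 105), ("fold_B", 245), ("fold_C", 70), ("fold_A", 200)]

length_acc = length_method_accuracy(queries, library)
random_acc = expected_random_accuracy(queries, library)
```

On this toy set the length-based rule assigns three of the four queries correctly, while the random lower control is expected to get 0.35 of them right. The point of reporting both numbers is the one the abstract makes: a method's accuracy only means something relative to what guessing would achieve on the same library.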
