Practical limits of function prediction

Author(s) - Devos Damien, Valencia Alfonso

Publication year - 2000

Publication title - Proteins: Structure, Function, and Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.699

H-Index - 191

eISSN - 1097-0134

pISSN - 0887-3585

DOI - 10.1002/1097-0134(20001001)41:1<98::aid-prot120>3.0.co;2-s

Subject(s) - protein function prediction, function (biology), pairwise comparison, sequence (biology), similarity (geometry), identity (music), limit (mathematics), protein function, computational biology, protein sequencing, computer science, mathematics, pattern recognition (psychology), artificial intelligence, biology, peptide sequence, genetics, image (mathematics), physics, mathematical analysis, gene, acoustics

The widening gap between known protein sequences and their functions has led to the practice of assigning a potential function to a protein on the basis of sequence similarity to proteins whose function has been experimentally investigated. We present here a critical view of the theoretical and practical bases for this approach. The results obtained by analyzing a significant number of true sequence similarities, derived directly from structural alignments, point to the complexity of function prediction. Different aspects of protein function, including (i) enzymatic function classification, (ii) functional annotations in the form of key words, (iii) classes of cellular function, and (iv) conservation of binding sites, can be reliably transferred between similar sequences only to a modest degree. The reason for this difficulty is a combination of unavoidable database inaccuracies and the plasticity of protein function. In addition, analysis of the relationship between sequence and functional descriptions defines an empirical limit for pairwise‐based functional annotations: the first three of the six numbers used as descriptors of protein folds in the FSSP database can be predicted at an average level as low as 7.5% sequence identity, two of the four EC digits at 15% identity, half of the SWISS‐PROT key words related to protein function at 20% identity, and half of the residues in the binding site at 30% identity. Proteins 2000;41:98–107. © 2000 Wiley‐Liss, Inc.
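The empirical limits quoted in the abstract can be read as a small lookup from pairwise sequence identity to the level of annotation that might be transferred. The following is only an illustrative sketch, not the authors' method: the names `THRESHOLDS`, `percent_identity`, and `transferable_annotations` are hypothetical, and identity is assumed here to be the fraction of identical residues over aligned, non-gap columns (the abstract does not state the exact definition). The abstract reports these figures as average levels, so treating them as hard cutoffs is a simplification.

```python
# Average identity levels quoted in the abstract (percent sequence identity),
# paired with the level of annotation reported as predictable there.
THRESHOLDS = [
    (7.5, "first three of the six FSSP fold-descriptor numbers"),
    (15.0, "two of the four EC digits"),
    (20.0, "half of the SWISS-PROT function key words"),
    (30.0, "half of the binding-site residues"),
]


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: this is one common definition of identity; the paper may
    use a different normalization.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def transferable_annotations(identity: float) -> list[str]:
    """Annotation levels whose quoted average threshold is met."""
    return [label for cutoff, label in THRESHOLDS if identity >= cutoff]


if __name__ == "__main__":
    ident = percent_identity("ACDE-", "ACDFG")
    print(f"{ident:.1f}% identity")
    for label in transferable_annotations(18.0):
        print("at 18% identity:", label)
```

Keeping the thresholds in a single ordered table makes it easy to swap in different cutoffs if one prefers more conservative values than the averages reported in the abstract.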
