Knowledge‐based voting algorithm for automated protein functional annotation
Author(s) - Yu G.X., Glass E.M., Karonis N.T., Maltsev N.
Publication year - 2005
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.20652
Subject(s) - annotation , voting , computer science , function (biology) , data mining , similarity (geometry) , machine learning , majority rule , artificial intelligence , image (mathematics) , biology , evolutionary biology , politics , political science , law
Abstract Automated annotation of high‐throughput genome sequences is one of the earliest steps toward a comprehensive understanding of the dynamic behavior of living organisms. However, this step is often error‐prone because its underlying algorithms rely mainly on simple similarity analysis and lack guidance from biological rules. We present herein a knowledge‐based protein annotation algorithm. Our objectives are to reduce errors and to improve annotation confidence. The algorithm consists of two major components: a knowledge system, called “RuleMiner,” and a voting procedure. The knowledge system, which includes biological rules and functional profiles for each function, provides a platform for seamless integration of multiple sequence analysis tools and guidance for function annotation. The voting procedure, which relies on the knowledge system, is designed to make (possibly) unbiased judgments in functional assignments from complicated, sometimes conflicting information. We have applied this algorithm to 10 prokaryotic bacterial genomes and observed a significant improvement in annotation confidence. We also discuss the current limitations of the algorithm and the potential for future improvement. Proteins 2005. © 2005 Wiley‐Liss, Inc.