The impact of incomplete knowledge on evaluation: an experimental benchmark for protein function prediction
Author(s) - Curtis Huttenhower, Matthew Hibbs, Chad L. Myers, Amy A. Caudy, David Hess, Olga G. Troyanskaya
Publication year - 2009
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btp397
Subject(s) - benchmark (surveying), gold standard (test), computer science, protein function prediction, inference, machine learning, function (biology), artificial intelligence, annotation, set (abstract data type), data mining, protein function, biology, gene, mathematics, biochemistry, statistics, geodesy, evolutionary biology, programming language, geography
Rapidly expanding repositories of highly informative genomic data have generated increasing interest in methods for protein function prediction and inference of biological networks. The successful application of supervised machine learning to these tasks requires a gold standard for protein function: a trusted set of correct examples, which can be used to assess performance through cross-validation or other statistical approaches. Since gene annotation is incomplete for even the best studied model organisms, the biological reliability of such evaluations may be called into question.