Gene Function Prediction from Functional Association Networks Using Kernel Partial Least Squares Regression
Author(s) -
Sonja Lehtinen,
Jonathan Lees,
Jürg Bähler,
John Shawe-Taylor,
Christine Orengo
Publication year - 2015
Publication title -
PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0134668
Subject(s) - computer science , benchmark (surveying) , regression , data mining , partial least squares regression , function (biology) , artificial intelligence , machine learning , independence (probability theory) , compass , association (psychology) , kernel (algebra) , gene ontology , protein function prediction , benchmarking , gene regulatory network , protein function , statistics , mathematics , gene , biology , philosophy , gene expression , biochemistry , cartography , geodesy , epistemology , combinatorics , marketing , evolutionary biology , business , geography
With the growing availability of large-scale biological datasets, automated methods of extracting functionally meaningful information from this data are becoming increasingly important. Data relating to functional association between genes or proteins, such as co-expression or functional association, is often represented in terms of gene or protein networks. Several methods of predicting gene function from these networks have been proposed. However, evaluating the relative performance of these algorithms may not be trivial: concerns have been raised over biases in different benchmarking methods and datasets, particularly relating to non-independence of functional association data and test data. In this paper we propose a new network-based gene function prediction algorithm using a commute-time kernel and partial least squares regression (Compass). We compare Compass to GeneMANIA, a leading network-based prediction algorithm, using a number of different benchmarks, and find that Compass outperforms GeneMANIA on these benchmarks. We also explicitly explore problems associated with the non-independence of functional association data and test data. We find that a benchmark based on the Gene Ontology database, which, directly or indirectly, incorporates information from other databases, may considerably overestimate the performance of algorithms exploiting functional association data for prediction.
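The two ingredients named in the abstract, a commute-time kernel and partial least squares regression, can be illustrated with a minimal sketch. It is not the authors' Compass implementation. It assumes the commute-time kernel is taken as the pseudoinverse of the graph Laplacian and uses a simplified single-response kernel PLS. The toy network, the labels, and the function names `commute_time_kernel` and `kernel_pls_scores` are hypothetical and chosen only for illustration.

```python
import numpy as np


def commute_time_kernel(adjacency):
    """Pseudoinverse of the graph Laplacian L = D - A (commute-time kernel)."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return np.linalg.pinv(laplacian)


def kernel_pls_scores(K, train, test, y_train, n_components=2):
    """Score test genes with single-response kernel PLS fitted on labelled genes."""
    n = len(train)
    K_tr = K[np.ix_(train, train)]
    K_te = K[np.ix_(test, train)]

    # Centre the kernel in feature space, using training genes only.
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K_tr @ J
    Kc_te = (K_te - K_tr.mean(axis=0, keepdims=True)) @ J

    y_mean = y_train.mean()
    y = (y_train - y_mean).reshape(-1, 1)
    if np.linalg.norm(y) < 1e-10:
        raise ValueError("labels must contain both classes")

    Kd, yd = Kc.copy(), y.copy()
    T, U = [], []
    for _ in range(n_components):
        if np.linalg.norm(yd) < 1e-10:
            break  # labels already fully explained by earlier components
        u = yd / np.linalg.norm(yd)  # one response: u is the deflated label vector
        t = Kd @ u
        if np.linalg.norm(t) < 1e-10:
            break
        t = t / np.linalg.norm(t)
        T.append(t)
        U.append(u)
        # Deflate the kernel and the labels by the extracted score vector.
        P = np.eye(n) - t @ t.T
        Kd = P @ Kd @ P
        yd = yd - t @ (t.T @ yd)

    T, U = np.hstack(T), np.hstack(U)
    coef = U @ np.linalg.solve(T.T @ Kc @ U, T.T @ y)
    return (Kc_te @ coef).ravel() + y_mean


# Hypothetical toy network: two triangles (genes 0-2 and 3-5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

K = commute_time_kernel(A)
train, test = [0, 1, 4, 5], [2, 3]
labels = np.array([1.0, 1.0, 0.0, 0.0])  # genes 0 and 1 annotated with the function
scores = kernel_pls_scores(K, train, test, labels)
print(dict(zip(test, scores)))
```

In this toy network, gene 2 sits in the same triangle as the two annotated genes and so receives a higher score than gene 3. A real application would build the kernel from a genome-scale functional association network and choose the number of components by cross-validation.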