Open Access
Predictor correlation impacts machine learning algorithms: implications for genomic studies
Author(s) - Kristin K. Nicodemus, James D. Malley
Publication year - 2009
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btp331
Subject(s) - random forest, permutation (music), correlation, multivariate statistics, machine learning, computer science, inference, tree (set theory), algorithm, artificial intelligence, matthews correlation coefficient, regression, feature (linguistics), statistics, mathematics, mathematical analysis, linguistics, physics, geometry, philosophy, acoustics, support vector machine
The advent of high-throughput genomics has produced studies with large numbers of predictors (e.g. genome-wide association and microarray studies). Machine learning algorithms (MLAs) are a computationally efficient way to identify phenotype-associated variables in high-dimensional data. Important results from mathematical theory and numerous practical results document their value. One attractive feature of MLAs is that many operate in a fully multivariate environment, allowing variables of small individual importance to be included when they act cooperatively. However, certain properties of MLAs under conditions common in genomic data have not been well studied; in particular, correlations among predictors pose a problem.
