Multiple correspondence discriminant analysis: An application to detect stratification in copy number variation
Author(s) - Caceres, Alejandro; Basagaña, Xavier; Gonzalez, Juan R.
Publication year - 2010
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.3890
Subject(s) - linear discriminant analysis , variation (astronomy) , statistics , stratification (seeds) , copy number variation , computer science , artificial intelligence , mathematics , biology , genetics , gene , seed dormancy , physics , germination , botany , genome , dormancy , astrophysics
We illustrate the use of multiple correspondence analysis (MCA) to correct for population stratification of copy number alteration data. In addition, we propose the use of multiple correspondence discriminant analysis (MCDA) to identify an optimal set of copy number variants (CNVs) that correctly infers the population stratification of a CNV map. Within MCDA, we highlight the novel use of correlation with class directions for variable ranking. We found a set of 20 CNVs with 98 per cent predictability in a CNV map of the HapMap populations. On this sample, the selection of variables based on centroid ranking outperformed the most common practice of ranking variables with their correlation to the principal axes. Copyright © 2010 John Wiley & Sons, Ltd.
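Illustrative sketch (not part of the published record): the abstract describes running MCA on categorical copy number calls and ranking variables by their correlation with class directions. The Python below is one possible reading of that idea on a made-up toy data set, not the authors' procedure. The function names (`indicator_matrix`, `mca_row_coordinates`, `rank_by_class_direction`), the toy calls, and the choice to take a "class direction" as the vector from the origin to a class centroid in MCA space are all assumptions made for illustration.

```python
import numpy as np

def indicator_matrix(calls):
    """One-hot encode an (n_samples, n_variables) array of categorical calls.

    Returns the indicator matrix Z and, for each column of Z, the index of
    the variable it came from.
    """
    blocks, owner = [], []
    for j in range(calls.shape[1]):
        levels = np.unique(calls[:, j])
        blocks.append((calls[:, [j]] == levels[None, :]).astype(float))
        owner.extend([j] * len(levels))
    return np.hstack(blocks), np.array(owner)

def mca_row_coordinates(Z, n_axes=2):
    """Standard MCA of an indicator matrix via SVD of standardized residuals."""
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sing, _ = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates of the samples on the leading axes.
    return (U[:, :n_axes] * sing[:n_axes]) / np.sqrt(r)[:, None]

def rank_by_class_direction(Z, owner, F, labels):
    """Score each variable by the largest absolute correlation between any of
    its indicator columns and the sample projections onto a class direction.

    Assumption: a class direction is the unit vector pointing from the origin
    of the MCA space to that class's centroid.
    """
    scores = np.zeros(owner.max() + 1)
    for g in np.unique(labels):
        direction = F[labels == g].mean(axis=0)
        norm = np.linalg.norm(direction)
        if norm == 0:
            continue
        proj = F @ (direction / norm)
        if proj.std() == 0:
            continue
        for k in range(Z.shape[1]):
            if Z[:, k].std() == 0:       # skip constant indicator columns
                continue
            corr = abs(np.corrcoef(Z[:, k], proj)[0, 1])
            scores[owner[k]] = max(scores[owner[k]], corr)
    return np.argsort(-scores), scores

# Toy data (hypothetical): 60 samples in two populations, 5 variables.
# Variables 0-2 differ between populations; variables 3-4 do not.
n = 60
idx = np.arange(n)
labels = (idx >= n // 2).astype(int)
calls = np.column_stack([
    labels * 2,          # copy number 0 vs 2
    labels + 1,          # copy number 1 vs 2
    2 - labels,          # copy number 2 vs 1
    idx % 3,             # unrelated to population
    (idx // 2) % 3,      # unrelated to population
])

Z, owner = indicator_matrix(calls)
F = mca_row_coordinates(Z, n_axes=2)
ranking, scores = rank_by_class_direction(Z, owner, F, labels)
print("variables ranked by class-direction correlation:", ranking)
```

In this toy setting the three population-specific variables receive the highest scores. Real CNV maps would also need a classifier and cross-validation to estimate predictive accuracy, which this sketch does not attempt.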