Variable selection and interpretation in correlation principal components
Author(s) - AlKandari Noriah M., Jolliffe Ian T.
Publication year - 2005
Publication title - Environmetrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.68
H-Index - 58
eISSN - 1099-095X
pISSN - 1180-4009
DOI - 10.1002/env.728
Subject(s) - principal component analysis, dimensionality reduction, dimension (graph theory), interpretation (philosophy), multivariate statistics, set (abstract data type), data set, computer science, mathematics, variable (mathematics), feature selection, statistics, data mining, artificial intelligence, pure mathematics, programming language, mathematical analysis
Principal component analysis (PCA) is a dimension‐reducing tool that replaces the variables in a multivariate data set by a smaller number of derived variables. Dimension reduction is often undertaken to help in interpreting the data set but, as each principal component usually involves all the original variables, interpretation of a PCA can still be difficult. One way to overcome this difficulty is to select a subset of the original variables and use this subset to approximate the principal components. This article reviews a number of techniques for choosing subsets of the variables and examines their merits in terms of preserving the information in the PCA, and in aiding interpretation of the main sources of variation in the data. Copyright © 2005 John Wiley & Sons, Ltd.
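As a rough illustration of the idea described in the abstract, the sketch below runs a PCA on the correlation matrix, picks one original variable per retained component, and then checks how well that subset reproduces the component scores. The selection rule used here (take the variable with the largest absolute loading on each component, skipping variables already chosen) is only one simple heuristic and is an assumption of this example; the abstract does not say which techniques the article reviews. The function names, the synthetic data, and the squared-correlation quality measure are likewise hypothetical choices made for illustration.

```python
import numpy as np


def correlation_pca(X):
    """Eigen-decompose the correlation matrix of X (rows = observations)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending variance
    return eigvals[order], eigvecs[:, order]


def select_by_largest_loading(loadings, k):
    """Pick one variable per component: the largest |loading| not yet chosen.

    Illustrative heuristic only, not necessarily a method from the article.
    """
    selected = []
    for j in range(k):
        for idx in np.argsort(-np.abs(loadings[:, j])):
            if int(idx) not in selected:
                selected.append(int(idx))
                break
    return selected


def approximation_quality(X, loadings, selected, k):
    """Squared correlation between each of the first k PC scores and its
    least-squares approximation built from the selected variables only."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize columns
    scores = Z @ loadings[:, :k]
    Zs = Z[:, selected]
    coef, *_ = np.linalg.lstsq(Zs, scores, rcond=None)
    fitted = Zs @ coef
    return np.array(
        [np.corrcoef(scores[:, j], fitted[:, j])[0, 1] ** 2 for j in range(k)]
    )


# Synthetic data (assumed for the example): six variables driven by two factors.
rng = np.random.default_rng(0)
n = 200
factors = rng.normal(size=(n, 2))
X = np.column_stack(
    [factors[:, 0] + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [factors[:, 1] + 0.3 * rng.normal(size=n) for _ in range(3)]
)

eigvals, loadings = correlation_pca(X)
selected = select_by_largest_loading(loadings, k=2)
r2 = approximation_quality(X, loadings, selected, k=2)

print("proportion of variance, first two PCs:", eigvals[:2] / eigvals.sum())
print("selected variable indices:", selected)
print("squared correlation with PC scores:", r2)
```

Values of the squared correlation close to 1 suggest that the chosen subset preserves most of the information in the corresponding component, while the short list of selected variables is easier to interpret than a component that loads on every variable.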