Optimal Representation of Supplementary Variables in Biplots from Principal Component Analysis and Correspondence Analysis
Author(s) - Graffelman Jan, Aluja-Banet Tomàs
Publication year - 2003
Publication title - Biometrical Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.108
H-Index - 63
eISSN - 1521-4036
pISSN - 0323-3847
DOI - 10.1002/bimj.200390027
Subject(s) - biplot, principal component analysis, correspondence analysis, mathematics, statistics, context (archaeology), representation (politics), variable (mathematics), multidimensional scaling, procrustes analysis, regression analysis, interpretation (philosophy), multiple correspondence analysis, computer science, geometry, mathematical analysis, biochemistry, chemistry, politics, political science, genotype, law, gene, programming language, paleontology, biology
Abstract This paper treats the topic of representing supplementary variables in biplots obtained by principal component analysis (PCA) and correspondence analysis (CA). We follow a geometrical approach where we minimize errors that are obtained when the scores of the PCA or CA solution are projected onto a vector that represents a supplementary variable. This paper shows that optimal directions for supplementary variables can be found by solving a regression problem, and justifies that earlier formulae from Gabriel are optimal in the least squares sense. We derive new results regarding the geometrical properties, goodness of fit statistics and the interpretation of supplementary variables. It is shown that supplementary variables can be represented by plotting their correlation coefficients with the axes of the biplot only when the proper type of scaling is used. We discuss supplementary variables in an ecological context and give illustrations with data from an environmental monitoring survey.
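
The regression idea in the abstract can be illustrated with a small sketch. It is not the paper's own formulae or code: the helper names `pca_scores` and `supplementary_direction` are hypothetical, the data are simulated, and only the PCA case is covered. The sketch assumes the supplementary variable is centred and regressed by ordinary least squares on the first two principal-component scores. The coefficient vector is then the direction drawn in the biplot, and the regression R² serves as a goodness-of-fit measure. The last lines check a standard property of least squares with uncorrelated regressors: when both the scores and the supplementary variable are standardized, the coefficients coincide with the correlations between the variable and the axes.

```python
import numpy as np

def pca_scores(X, n_axes=2):
    """Principal-component scores of the column-centred data, via the SVD."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_axes] * s[:n_axes]

def supplementary_direction(F, y):
    """Least-squares biplot vector for a supplementary variable y.

    Returns the regression coefficients of centred y on the scores F
    and the R^2 of that regression (assumed goodness-of-fit measure).
    """
    yc = y - y.mean()
    g, *_ = np.linalg.lstsq(F, yc, rcond=None)
    fitted = F @ g
    r2 = float(fitted @ fitted / (yc @ yc))
    return g, r2

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
F = pca_scores(X)

# A supplementary variable that depends mostly on the first two axes.
y = 2.0 * F[:, 0] - 1.0 * F[:, 1] + rng.normal(scale=0.1, size=50)
g, r2 = supplementary_direction(F, y)

# With standardized scores and a standardized variable, the regression
# coefficients equal the correlations of y with the biplot axes.
Fs = F / F.std(axis=0)
ys = (y - y.mean()) / y.std()
g_std, _ = supplementary_direction(Fs, ys)
corr = np.array([np.corrcoef(y, F[:, k])[0, 1] for k in range(F.shape[1])])
```

With unstandardized scores, `g` and `corr` generally differ, which is consistent with the abstract's remark that plotting correlations is appropriate only under the proper type of scaling.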