METHODOLOGICAL INSIGHTS: Increasing the value of principal components analysis for simplifying ecological data: a case study with rivers and river birds
Author(s) - Vaughan I. P., Ormerod S. J.
Publication year - 2005
Publication title - Journal of Applied Ecology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.503
H-Index - 181
eISSN - 1365-2664
pISSN - 0021-8901
DOI - 10.1111/j.1365-2664.2005.01038.x
Subject(s) - principal component analysis, cluster analysis, interpretability, categorical variable, multivariate statistics, correspondence analysis, variable (mathematics), data mining, multiple correspondence analysis, ecology, computer science, statistics, indicator value, variance (accounting), mathematics, artificial intelligence, biology, mathematical analysis, accounting, business
Summary
1. Two priorities for applied ecologists are to (i) maintain quantitative rigour with minimal resources and (ii) ensure that multivariate results are readily understood by end users. Habitat descriptions and other complex data present particular challenges.
2. Principal components analysis (PCA) is often used to reduce data and stabilize subsequent statistical analyses. Interpretation can be difficult, however, and PCA is optimized for quantitative (cf. categorical) data. Moreover, future applications (e.g. in predicting species' distributions) require the recording of all contributing variables irrespective of cost or importance.
3. We considered the potential benefits of two PCA variants. First, we considered whether a cluster analysis on the correlation matrix of independent variables (i.e. variable clustering), followed by a PCA within each cluster, produced a more easily interpreted output than conventional PCA, while simultaneously reducing costs. Secondly, we considered whether a generalized PCA capable of analysing qualitative data could out‐perform conventional PCA when ecological data include ordinal variables. As a case study, we used data from river habitat survey (RHS), a key applied tool in river ecology that uses more than 100 variables to describe river structure and relies heavily on three‐point ordinal scales. In distribution models that linked river birds to RHS, we compared the interpretability and efficiency of variable clustering and generalized PCA against conventional PCA.
4. While variable clustering gave similar predictive performance to PCA, habitat factors generated by the former were more readily interpreted than conventional principal components. Of the two cluster‐scoring methods, optimally scaled PCA explained 24% more variance in the first principal component and marginally improved the accuracy of distribution models.
5. Synthesis and applications. Initial variable clustering makes PCA more interpretable and will benefit the understanding of research results and their translation into management. Variable clustering should also reduce costs as variables contributing to unused clusters need not be recorded in future (cf. PCA). Optimal scaling further increases the versatility of PCA: qualitative ecological data (e.g. habitat categories) can be analysed in the same way as quantitative data, with real benefits to applied research. With cost constraints and the need for dissemination key applied issues, our results offer an important potential advance.
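
The first variant described in the summary (cluster the variables on their correlation matrix, then run a PCA within each cluster) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the summary does not state the clustering algorithm, the dissimilarity measure or the software used, so the choice of average-linkage hierarchical clustering on 1 - |r|, the function names (`cluster_variables`, `first_pc_scores`, `cluster_then_pca`) and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def cluster_variables(X, n_clusters):
    """Group the columns of X by hierarchical clustering on 1 - |correlation|.

    Assumption: average linkage and an absolute-correlation dissimilarity;
    the summary only says the clustering is done on the correlation matrix.
    """
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)
    dist = (dist + dist.T) / 2.0      # remove floating-point asymmetry
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def first_pc_scores(X_sub):
    """Scores on the first principal component of standardized columns.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    Z = (X_sub - X_sub.mean(axis=0)) / X_sub.std(axis=0, ddof=1)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, 0] * s[0]
    explained = s[0] ** 2 / np.sum(s ** 2)   # share of cluster variance on PC1
    return scores, explained


def cluster_then_pca(X, n_clusters):
    """One habitat-style factor per variable cluster: the cluster's first PC."""
    labels = cluster_variables(X, n_clusters)
    factors = {}
    for k in np.unique(labels):
        cols = np.flatnonzero(labels == k)
        scores, explained = first_pc_scores(X[:, cols])
        factors[int(k)] = {
            "columns": cols.tolist(),
            "scores": scores,
            "explained": float(explained),
        }
    return factors


# Synthetic example (hypothetical data): two latent gradients, three noisy
# measurements of each, standing in for groups of correlated habitat variables.
rng = np.random.default_rng(0)
n_sites = 200
gradient_a = rng.normal(size=n_sites)
gradient_b = rng.normal(size=n_sites)
X = np.column_stack(
    [gradient_a + 0.1 * rng.normal(size=n_sites) for _ in range(3)]
    + [gradient_b + 0.1 * rng.normal(size=n_sites) for _ in range(3)]
)

factors = cluster_then_pca(X, n_clusters=2)
for k, info in factors.items():
    print(k, info["columns"], round(info["explained"], 3))
```

Because each factor is built only from the variables in its own cluster, any cluster whose factor is not used in a later model identifies variables that would not need to be recorded again, which is the cost argument made in point 5. A conventional PCA, by contrast, loads every variable on every component.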
