Principal component analysis: a review and recent developments
Author(s) - Ian T. Jolliffe, Jorge Cadima
Publication year - 2016
Publication title - Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.074
H-Index - 169
eISSN - 1471-2962
pISSN - 1364-503X
DOI - 10.1098/rsta.2015.0202
Subject(s) - principal component analysis, interpretability, curse of dimensionality, uncorrelated variables, eigenvalues and eigenvectors, variance, dimensionality reduction, sparse PCA, data mining, machine learning, statistics
Large datasets are increasingly common and are often difficult to interpret. Principal component analysis (PCA) is a technique for reducing the dimensionality of such datasets, increasing interpretability while minimizing information loss. It does so by creating new uncorrelated variables that successively maximize variance. Finding such new variables, the principal components, reduces to solving an eigenvalue/eigenvector problem, and the new variables are defined by the dataset at hand, not a priori, making PCA an adaptive data analysis technique. It is adaptive in another sense too, since variants of the technique have been developed that are tailored to various different data types and structures. This article will begin by introducing the basic ideas of PCA, discussing what it can and cannot do. It will then describe some variants of PCA and their application.
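The abstract's description of PCA as an eigenvalue/eigenvector problem can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's own code: it centers the data, eigendecomposes the sample covariance matrix, and projects onto the leading eigenvectors to obtain the successively variance-maximizing, uncorrelated components. The function name `pca` and its signature are choices made here for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Sketch of PCA via eigendecomposition of the covariance matrix.

    X: (n_samples, n_variables) data matrix.
    Returns (scores, explained_variances, loadings).
    """
    # Center each variable: PCA maximizes the variance of projections
    # of the mean-centered data.
    Xc = X - X.mean(axis=0)
    # Sample covariance matrix of the variables.
    cov = np.cov(Xc, rowvar=False)
    # eigh suits the symmetric covariance matrix and returns
    # eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder eigenpairs by decreasing eigenvalue (variance explained).
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Principal component scores: projections onto the leading eigenvectors.
    scores = Xc @ eigvecs[:, :n_components]
    return scores, eigvals[:n_components], eigvecs[:, :n_components]
```

Because the eigenvectors of a symmetric matrix are orthogonal, the resulting component scores are uncorrelated, and the eigenvalues equal the variances captured by each component, which is exactly the property the abstract highlights.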