Assessing local influence in principal component analysis with application to haematology study data
Author(s) - Fung Wing K., Gu Hong, Xiang Liming, Yau Kelvin K. W.
Publication year - 2006
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.2747
Subject(s) - principal component analysis, data set, eigenvalues and eigenvectors, data mining, mathematics, perturbation (astronomy), statistics, computer science, sensitivity (control systems), physics, quantum mechanics, electronic engineering, engineering
High‐dimensional data are often encountered in medical and health studies. Principal component analysis (PCA) is a commonly used technique for reducing such data to a few components that include most of the information provided by the original data. However, PCA is known to be very sensitive to some abnormal observations, so it is essential to assess this sensitivity. In this paper, assessments of local influence based on the generalized influence function are developed under the case‐weights and additive perturbation schemes, along with a discussion of the perturbation scheme and the generalized influence function approach. When different variables of the data are perturbed, the directions of the largest joint local influence for the eigenvalues are all the same. Moreover, these directions are completely determined by the score values of the observations, for which an approximate cut‐off point is given. The proposed methods are applied to a set of haematology study data for illustration. The results add new insights into finding influential observations in the studied data set. Copyright © 2006 John Wiley & Sons, Ltd.
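The link between eigenvalue sensitivity and observation scores can be illustrated with a minimal sketch. It does not reproduce the generalized influence function approach of the paper. It assumes instead the classical first-order empirical influence of observation i on the k-th covariance eigenvalue, z_ik^2 - lambda_k, where z_ik is the score of observation i on component k. The function name `eigenvalue_case_influence` and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def eigenvalue_case_influence(X):
    """Empirical influence of each observation on each covariance eigenvalue.

    Assumption (not taken from the paper): the classical first-order result
    IF(x_i; lambda_k) = z_ik**2 - lambda_k, with z_ik the PC score.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                 # centre each variable
    S = Xc.T @ Xc / n                       # covariance with divisor n
    eigvals, eigvecs = np.linalg.eigh(S)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                   # principal component scores
    influence = scores**2 - eigvals         # one row per observation
    return eigvals, scores, influence


# Simulated data (hypothetical): 50 standard-normal cases, one gross outlier.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[0] = [10.0, 10.0, 10.0]

eigvals, scores, influence = eigenvalue_case_influence(X)
most_influential = int(np.argmax(np.abs(influence[:, 0])))
print("case with largest influence on the first eigenvalue:", most_influential)
```

With the divisor n, the influences on each eigenvalue sum to zero across observations, so a case with a large squared score stands out against the rest. This is one simple way to see why the score values carry the information about influence on the eigenvalues.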