Outlier detection for skewed data
Author(s) - Hubert Mia, Van der Veeken Stephan
Publication year - 2008
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.1123
Subject(s) - outlier, skewness, anomaly detection, measure (data warehouse), univariate, bivariate analysis, bivariate data, plot (graphics), computer science, generalization, pattern recognition (psychology), mathematics, artificial intelligence, data mining, statistics, multivariate statistics, mathematical analysis
Most outlier detection rules for multivariate data are based on the assumption of elliptical symmetry of the underlying distribution. We propose an outlier detection method that does not need the assumption of symmetry and does not rely on visual inspection. Our method is a generalization of the Stahel–Donoho outlyingness. The latter approach assigns to each observation a measure of outlyingness, which is obtained by projection pursuit techniques that only use univariate robust measures of location and scale. To allow skewness in the data, we adjust this measure of outlyingness by using a robust measure of skewness as well. The observations corresponding to an outlying value of the adjusted outlyingness (AO) are then considered as outliers. For bivariate data, our approach leads to two graphical representations. The first one is a contour plot of the AO values. We also construct an extension of the boxplot for bivariate data, in the spirit of the bagplot [1], which is based on the concept of halfspace depth. We illustrate our outlier detection method on several simulated and real data sets. Copyright © 2008 John Wiley & Sons, Ltd.
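The abstract describes the adjusted outlyingness only in words, so the following is a minimal Python sketch of the idea rather than the authors' implementation. Several choices are assumptions not stated above: the medcouple is used as the robust skewness measure, the fence constants follow the skewness-adjusted boxplot, directions are drawn as random unit vectors, and the function names (`medcouple`, `skew_adjusted_scales`, `adjusted_outlyingness`) are hypothetical. Each observation is projected onto many directions, and its distance to the projected median is divided by a scale that differs on the two sides of the median.

```python
import math
import random
import statistics


def medcouple(values):
    """Naive O(n^2) medcouple, a robust skewness measure in [-1, 1].

    Simplification (assumption): pairs with equal values are skipped
    instead of using the special tie kernel; fine for continuous data.
    """
    med = statistics.median(values)
    lower = [v for v in values if v <= med]
    upper = [v for v in values if v >= med]
    kernel = [((hi - med) - (med - lo)) / (hi - lo)
              for lo in lower for hi in upper if hi > lo]
    return statistics.median(kernel) if kernel else 0.0


def skew_adjusted_scales(values):
    """Return (median, lower scale, upper scale) of one projection."""
    med = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    mc = medcouple(values)
    # Assumed fence constants, borrowed from the skewness-adjusted boxplot.
    if mc >= 0:
        lo_fence = q1 - 1.5 * math.exp(-4 * mc) * iqr
        hi_fence = q3 + 1.5 * math.exp(3 * mc) * iqr
    else:
        lo_fence = q1 - 1.5 * math.exp(-3 * mc) * iqr
        hi_fence = q3 + 1.5 * math.exp(4 * mc) * iqr
    # Whiskers: the most extreme observations still inside the fences.
    c1 = min(v for v in values if v >= lo_fence)
    c2 = max(v for v in values if v <= hi_fence)
    return med, med - c1, c2 - med


def adjusted_outlyingness(points, n_directions=50, seed=0):
    """Largest side-specific standardized distance over random directions."""
    rng = random.Random(seed)
    dim = len(points[0])
    ao = [0.0] * len(points)
    for _ in range(n_directions):
        # Assumption: random unit vectors stand in for the direction search.
        direction = [rng.gauss(0, 1) for _ in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction))
        direction = [d / norm for d in direction]
        proj = [sum(d * x for d, x in zip(direction, p)) for p in points]
        med, lo_scale, hi_scale = skew_adjusted_scales(proj)
        for i, z in enumerate(proj):
            scale = hi_scale if z >= med else lo_scale
            ao[i] = max(ao[i], abs(z - med) / scale)
    return ao


# Right-skewed bivariate sample plus one planted point on the short-tailed side.
data_rng = random.Random(1)
data = [(data_rng.expovariate(1.0), data_rng.expovariate(1.0))
        for _ in range(200)]
data.append((-6.0, -6.0))
scores = adjusted_outlyingness(data)
most_outlying = max(range(len(scores)), key=scores.__getitem__)
print(most_outlying, round(scores[most_outlying], 2))
```

Because the scale is computed separately below and above the median, points far out in the long right tail of the exponential margins are penalized less than a point the same distance away on the short left side. Flagging outliers would still require a cutoff on the AO values, which this sketch leaves open.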