Open Access
Comparison of Multivariate Outlier Detection Methods for Nearly Elliptical Distributions
Author(s) - Kazumi Wada, Mariko Kawano, Hiroe Tsubaki
Publication year - 2020
Publication title - Österreichische Zeitschrift für Statistik
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.342
H-Index - 9
ISSN - 1026-597X
DOI - 10.17713/ajs.v49i2.872
Subject(s) - outlier , estimator , anomaly detection , multivariate statistics , computer science , robust statistics , variance (accounting) , elliptical distribution , skew , covariance , monte carlo method , statistics , data mining , algorithm , multivariate normal distribution , mathematics , accounting , telecommunications , business
This paper evaluates the performance of multivariate outlier detection methods on asymmetrically distributed datasets. With practical application to national survey data processing in mind, we choose four estimators known to perform well on elliptically distributed data: the modified Stahel-Donoho (MSD) estimators, the blocked adaptive computationally efficient outlier nominators (BACON), the minimum covariance determinant estimator obtained by a fast algorithm (Fast-MCD), and the nearest-neighbour variance estimator (NNVE). For the evaluation, we adopt a multivariate skew-t data model that is skewed only in the direction of its main axis, and we contaminate it with outliers drawn from another probability distribution. We conduct Monte Carlo simulations under this data model to compare the outlier detection performance of the four methods. We also explore the applicability of the selected methods to several accounting items in survey data on small and medium enterprises. The results indicate that the MSD estimators are the most suitable.
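To make the idea of projection-based outlier detection on contaminated data concrete, the following is a minimal sketch of the basic Stahel-Donoho outlyingness, approximated with random directions. It is not the MSD estimator evaluated in the paper, and it makes several simplifying assumptions: the bulk of the data is a correlated bivariate normal sample rather than a skew-t sample, and the function names (`sd_outlyingness`, `simulate_contaminated`), the number of directions, and the cutoff of 5.0 are hypothetical choices made only for illustration.

```python
import math
import random
import statistics


def sd_outlyingness(data, n_directions=500, seed=0):
    """Approximate Stahel-Donoho outlyingness with random unit directions.

    For each direction a, every point is scored by |a'x - median| / (1.4826 * MAD)
    of the projected data; a point's outlyingness is its largest score.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    scores = [0.0] * len(data)
    for _ in range(n_directions):
        # Draw a direction uniformly on the unit sphere.
        a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in a))
        a = [v / norm for v in a]
        proj = [sum(ai * xi for ai, xi in zip(a, x)) for x in data]
        med = statistics.median(proj)
        mad = 1.4826 * statistics.median(abs(p - med) for p in proj)
        if mad == 0.0:
            continue  # degenerate projection, skip this direction
        for i, p in enumerate(proj):
            scores[i] = max(scores[i], abs(p - med) / mad)
    return scores


def simulate_contaminated(n_bulk=200, n_out=10, seed=1):
    """Hypothetical data model: correlated normal bulk plus a shifted cluster."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_bulk):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        data.append((z1, 0.8 * z1 + 0.6 * z2))  # correlation 0.8
    for _ in range(n_out):
        # Outliers placed against the direction of the correlation.
        data.append((6.0 + 0.5 * rng.gauss(0.0, 1.0),
                     -6.0 + 0.5 * rng.gauss(0.0, 1.0)))
    return data


def flag_outliers(data, cutoff=5.0):
    """Return the indices whose outlyingness exceeds the (assumed) cutoff."""
    scores = sd_outlyingness(data)
    return {i for i, s in enumerate(scores) if s > cutoff}


if __name__ == "__main__":
    sample = simulate_contaminated()
    flagged = flag_outliers(sample)
    true_outliers = set(range(200, 210))
    print("detected outliers:", len(flagged & true_outliers), "of", len(true_outliers))
    print("false positives:", len(flagged - true_outliers))
```

A Monte Carlo comparison in the spirit of the abstract would repeat `simulate_contaminated` with many seeds and record the detection and false-positive rates for each method under study. Using the median and MAD rather than the mean and standard deviation keeps the projected location and scale from being pulled toward the contaminating cluster.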