A one‐class peeling method for multivariate outlier detection with applications in phase I SPC
Author(s) - Martinez Waldyn G., Weese Maria L., Jones-Farmer L. Allison
Publication year - 2020
Publication title - Quality and Reliability Engineering International
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.913
H-Index - 62
eISSN - 1099-1638
pISSN - 0748-8017
DOI - 10.1002/qre.2629
Subject(s) - outlier, computer science, anomaly detection, data mining, multivariate statistics, covariance, data set, statistical process control, set (abstract data type), dimension (graph theory), process (computing), covariance matrix, control chart, stability (learning theory), class (philosophy), artificial intelligence, machine learning, statistics, algorithm, mathematics, pure mathematics, programming language, operating system
In phase I of statistical process control (SPC), control charts are often used as outlier detection methods to assess process stability. Many of these methods require estimation of the covariance matrix, are computationally infeasible, or have not been studied when the dimension of the data, p, is large. We propose the one‐class peeling (OCP) method, a flexible framework that combines statistical and machine learning methods to detect multiple outliers in multivariate data. The OCP method can be applied to phase I of SPC, does not require covariance estimation, and is well suited to high‐dimensional data sets with a high percentage of outliers. Our empirical evaluation suggests that the OCP method performs well in high dimensions and is more computationally efficient and more robust than existing methodologies. We motivate and illustrate the use of the OCP method in a phase I SPC application on an N = 354, p = 1917 dimensional data set containing Wikipedia search results for National Football League (NFL) players, teams, coaches, and managers. The example data set and the R functions OCP.R and OCPLimit.R, which compute the respective OCP distances and thresholds, are available in the supplementary materials.
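The abstract does not spell out the algorithm, so the following is only a generic illustration of peeling-based outlier detection without covariance estimation. It is not the authors' OCP procedure and not a port of OCP.R or OCPLimit.R. As a stand-in for a one-class classifier boundary, it peels away the points farthest from the current centroid (plain Euclidean distance). It then flags observations whose distance from the peeled center exceeds a median-plus-MAD threshold. The function names, the peel fraction, and the cutoff are all assumptions made for this sketch.

```python
import math
import random
import statistics


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Coordinate-wise mean of a list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def peel_center(points, peel_frac=0.1, min_keep=2):
    """Repeatedly drop the outermost points; return the centroid of the core.

    Assumption: distance to the current centroid stands in for the
    boundary of a one-class classifier. No covariance matrix is estimated.
    """
    core = list(points)
    while len(core) > min_keep:
        c = centroid(core)
        core.sort(key=lambda pt: euclid(pt, c))
        n_drop = max(1, int(peel_frac * len(core)))
        core = core[:max(min_keep, len(core) - n_drop)]
    return centroid(core)


def flag_outliers(points, cutoff=3.0):
    """Flag points whose distance to the peeled center is unusually large.

    The threshold median + cutoff * 1.4826 * MAD is a hypothetical choice
    for this sketch, not the limit computed by the paper's code.
    """
    center = peel_center(points)
    dist = [euclid(pt, center) for pt in points]
    med = statistics.median(dist)
    mad = statistics.median([abs(d - med) for d in dist])
    threshold = med + cutoff * 1.4826 * mad
    return [d > threshold for d in dist]


# Synthetic example: 50 in-control points plus 5 shifted outliers, p = 5.
random.seed(0)
inliers = [[random.gauss(0, 1) for _ in range(5)] for _ in range(50)]
outliers = [[random.gauss(10, 1) for _ in range(5)] for _ in range(5)]
data = inliers + outliers
flags = flag_outliers(data)
print(sum(flags[50:]), "of 5 planted outliers flagged")
```

Because the center comes from the innermost points that survive peeling, a large cluster of outliers has little influence on it, which is the intuition behind using peeling when the outlier percentage is high.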
