z-logo
open-access-imgOpen Access
A Novel Approach for Outlier Detection in Multivariate Data
Author(s) -
Saima Afzal,
Ayesha Afzal,
Muhammad Amin,
Sehar Saleem,
Nouman Ali,
Muhammad Sajid
Publication year - 2021
Publication title -
mathematical problems in engineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.262
H-Index - 62
eISSN - 1026-7077
pISSN - 1024-123X
DOI - 10.1155/2021/1899225
Subject(s) - outlier , principal component analysis , anomaly detection , multivariate statistics , measure (data warehouse) , pattern recognition (psychology) , data mining , computer science , dimensionality reduction , dimension (graph theory) , artificial intelligence , sample (material) , mathematics , statistics , machine learning , pure mathematics , chemistry , chromatography
Outlier detection is a challenging task especially when outliers are defined by rare combinations of multiple variables. In this paper, we develop and evaluate a new method for the detection of outliers in multivariate data that relies on Principal Components Analysis (PCA) and three-sigma limits. The proposed approach employs PCA to effectively perform dimension reduction by regenerating variables, i.e., fitted points from the original observations. The observations lying outside the three-sigma limits are identified as the outliers. This proposed method has been successfully employed to two real life and several artificially generated datasets. The performance of the proposed method is compared with some of the existing methods using different performance evaluation criteria including the percentage of correct classification, precision, recall, and F-measure. The supremacy of the proposed method is confirmed by abovementioned criteria and datasets. The F-measure for the first real life dataset is the highest, i.e., 0.6667 for the proposed method and 0.3333 and 0.4000 for the two existing approaches. Similarly, for the second real dataset, this measure is 0.8000 for the proposed approach and 0.5263 and 0.6315 for the two existing approaches. It is also observed by the simulation experiments that the performance of the proposed approach got better with increasing sample size.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom