Outlier detection for multivariate categorical data
Author(s) - Puig Xavier, Ginebra Josep
Publication year - 2018
Publication title - Quality and Reliability Engineering International
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.913
H-Index - 62
eISSN - 1099-1638
pISSN - 0748-8017
DOI - 10.1002/qre.2339
Subject(s) - categorical variable , outlier , row , contingency table , multinomial distribution , multivariate statistics , computer science , skewness , statistics , data mining , artificial intelligence , mathematics , database
The detection of outlying rows in a contingency table is tackled from a Bayesian perspective by adapting the framework that Box and Tiao adopted for normal models to multinomial models with random effects. The solution assumes a two-component mixture of multinomial continuous mixtures for the rows, one component for the nonoutlier rows and the other for the outlier rows. The method first estimates the distributional characteristics of the nonoutlier rows and then uses cluster analysis to identify which rows belong to the outlier group. The method applies to any type of contingency table and, in particular, could be used in the analysis of multivariate categorical control charts. Its use is illustrated through a simulated example and by applying it to help identify heterogeneities of style among the acts in the plays of the First Folio edition of Shakespeare's drama.
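To make the two-component idea concrete, the following is a minimal sketch and not the authors' procedure. It assumes that each component is a Dirichlet-multinomial (one common multinomial continuous mixture), that both components share a row profile estimated by column-wise medians of row proportions, and that the concentrations `s_in` and `s_out` and the prior weight `prior_out` are fixed by hand rather than estimated in a Bayesian way. All function and parameter names are hypothetical.

```python
from math import exp, lgamma, log
from statistics import median


def dirmult_loglik(counts, alpha):
    """Log Dirichlet-multinomial probability of one row of counts."""
    n, a0 = sum(counts), sum(alpha)
    out = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    out += lgamma(a0) - lgamma(a0 + n)
    out += sum(lgamma(a + x) - lgamma(a) for a, x in zip(alpha, counts))
    return out


def robust_profile(table):
    """Column-wise median of row proportions, renormalized to sum to 1.

    Assumption: the median is only a stand-in for a proper estimate of
    the nonoutlier rows; every row must have a positive total.
    """
    props = [[x / sum(row) for x in row] for row in table]
    med = [max(median(col), 1e-9) for col in zip(*props)]  # avoid alpha = 0
    total = sum(med)
    return [m / total for m in med]


def outlier_posterior(table, s_in=200.0, s_out=5.0, prior_out=0.1):
    """Posterior probability that each row comes from the outlier component.

    Assumption: nonoutliers are tightly concentrated around the profile
    (large s_in) and outliers are diffuse (small s_out).
    """
    profile = robust_profile(table)
    a_in = [s_in * p for p in profile]
    a_out = [s_out * p for p in profile]
    probs = []
    for row in table:
        l_in = dirmult_loglik(row, a_in) + log(1.0 - prior_out)
        l_out = dirmult_loglik(row, a_out) + log(prior_out)
        m = max(l_in, l_out)  # subtract the max for numerical stability
        w_in, w_out = exp(l_in - m), exp(l_out - m)
        probs.append(w_out / (w_in + w_out))
    return probs


# Hypothetical 4 x 3 contingency table; the last row has a different profile.
table = [[50, 30, 20], [48, 32, 20], [52, 29, 19], [10, 20, 70]]
probs = outlier_posterior(table)
```

Rows whose posterior probability exceeds a chosen threshold, for instance 0.5, would be flagged as candidate outliers. In a full treatment the concentrations and the mixing weight would carry priors and be inferred from the data instead of being fixed.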