On the number of components in a Gaussian mixture model
Author(s) - McLachlan, Geoffrey J.; Rathnayake, Suren
Publication year - 2014
Publication title - Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.506
H-Index - 47
eISSN - 1942-4795
pISSN - 1942-4787
DOI - 10.1002/widm.1135
Subject(s) - mixture model, cluster analysis, computer science, probabilistic logic, mixture distribution, statistical model, expectation–maximization algorithm, artificial intelligence, kernel (algebra), gaussian, data mining, machine learning, pattern recognition (psychology), mathematics, statistics, probability density function, maximum likelihood, chemistry, computational chemistry, combinatorics
Mixture distributions, in particular normal mixtures, are applied to data with two main purposes in mind. One is to provide an appealing semiparametric framework in which to model unknown distributional shapes, as an alternative to, say, the kernel density method. The other is to use the mixture model to provide a probabilistic clustering of the data into g clusters corresponding to the g components in the mixture model. In both situations, there is the question of how many components to include in the normal mixture model. We review various methods that have been proposed to answer this question.
WIREs Data Mining Knowl Discov 2014, 4:341–355. doi: 10.1002/widm.1135
This article is categorized under: Technologies > Machine Learning
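One widely used family of methods the abstract alludes to is penalized-likelihood model selection: fit mixtures with an increasing number of components g and pick the g minimizing a criterion such as BIC. The sketch below is an illustration only, not the authors' own procedure; it assumes scikit-learn's GaussianMixture and uses synthetic data with two well-separated clusters.

```python
# Hedged sketch: choosing the number of mixture components g by
# minimizing BIC over candidate values of g. The data, g_max, and the
# use of scikit-learn are illustrative assumptions, not from the paper.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic 1-D data: two well-separated normal clusters.
data = np.concatenate([rng.normal(-3.0, 1.0, 200),
                       rng.normal(3.0, 1.0, 200)]).reshape(-1, 1)

def best_g_by_bic(X, g_max=6):
    """Fit mixtures with g = 1..g_max components (via EM) and return
    the g minimizing the Bayesian Information Criterion."""
    bics = []
    for g in range(1, g_max + 1):
        gm = GaussianMixture(n_components=g, random_state=0).fit(X)
        bics.append(gm.bic(X))
    return int(np.argmin(bics)) + 1

print(best_g_by_bic(data))
```

With two well-separated components, BIC typically recovers g = 2; in harder settings (overlapping or non-normal components), the review discusses why such criteria can over- or under-estimate g.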