Model‐based clustering of regression time series data via APECM—an AECM algorithm sung to an even faster beat
Author(s) - Chen WeiChen, Maitra Ranjan
Publication year - 2011
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.10143
Subject(s) - autoregressive model, cluster analysis, expectation–maximization algorithm, maximization, computation, time series, computer science, series (stratigraphy), mixture model, gaussian, algorithm, artificial intelligence, mathematics, machine learning, mathematical optimization, statistics, maximum likelihood, paleontology, physics, biology, quantum mechanics
We propose a model‐based approach for clustering time series regression data in an unsupervised machine learning framework to identify groups under the assumption that each mixture component follows a Gaussian autoregressive regression model of order p. Given the number of groups, the traditional maximum likelihood approach of estimating the parameters using the expectation‐maximization (EM) algorithm can be employed, although it is computationally demanding. The somewhat fast “tune” to the EM “folk song” provided by the Alternating Expectation Conditional Maximization (AECM) algorithm can alleviate the problem to some extent. In this article, we develop an alternative partial expectation conditional maximization algorithm (APECM) that uses an additional data augmentation storage step to efficiently implement AECM for finite mixture models. Results from our simulation experiments show improved performance, with both fewer iterations and less computation time. The methodology is applied to the problem of clustering mutual funds data on the basis of their average annual per cent returns and in the presence of economic indicators. © 2011 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 4: 567–578, 2011
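For orientation, the baseline that the abstract starts from (plain EM for a finite mixture of Gaussian regressions) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' APECM or AECM: it assumes i.i.d. errors within each series (autoregressive order p = 0), a design matrix shared by all series, and random soft initial assignments. The function name `em_mixture_regression` and all of its parameters are hypothetical.

```python
import numpy as np

def em_mixture_regression(X, Y, K, n_iter=50, seed=0):
    """Plain EM for a K-component mixture of Gaussian linear regressions.

    X: (T, d) design matrix shared by all series (an assumption of this sketch).
    Y: (n, T) array with one time series per row.
    Simplification: errors are i.i.d. within a series, i.e. no AR(p) structure.
    """
    rng = np.random.default_rng(seed)
    n, T = Y.shape
    resp = rng.dirichlet(np.ones(K), size=n)   # random soft assignments, (n, K)
    loglik = []
    for _ in range(n_iter):
        # M-step: mixing proportions, regression coefficients, error variances.
        nk = resp.sum(axis=0) + 1e-12          # guard against empty components
        pi = nk / nk.sum()
        beta = np.empty((K, X.shape[1]))
        sigma2 = np.empty(K)
        sq = np.empty((n, K))                  # squared residual norm per series
        for k in range(K):
            # With a shared X, weighted least squares reduces to regressing
            # the responsibility-weighted mean series on X.
            ybar = resp[:, k] @ Y / nk[k]
            beta[k] = np.linalg.lstsq(X, ybar, rcond=None)[0]
            sq[:, k] = ((Y - X @ beta[k]) ** 2).sum(axis=1)
            sigma2[k] = max(resp[:, k] @ sq[:, k] / (nk[k] * T), 1e-12)
        # E-step: posterior membership probabilities via log-sum-exp.
        logp = np.log(pi) - 0.5 * T * np.log(2 * np.pi * sigma2) - 0.5 * sq / sigma2
        m = logp.max(axis=1, keepdims=True)
        lse = m + np.log(np.exp(logp - m).sum(axis=1, keepdims=True))
        resp = np.exp(logp - lse)
        loglik.append(float(lse.sum()))        # observed-data log-likelihood
    return pi, beta, sigma2, resp, loglik
```

Each pass recomputes every residual and every component density from scratch, which is the kind of cost the abstract describes as computationally demanding. Handling AR(p) errors and the extra storage step that distinguishes APECM is outside the scope of this sketch.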