Model‐based clustering for spatiotemporal data on air quality monitoring
Author(s) - Cheam A. S. M., Marbac M., McNicholas P. D.
Publication year - 2017
Publication title - Environmetrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.68
H-Index - 58
eISSN - 1099-095X
pISSN - 1180-4009
DOI - 10.1002/env.2437
Subject(s) - cluster analysis, data mining, computer science, autoregressive model, mixture model, identifiability, expectation–maximization algorithm, information criteria, model selection, bayesian information criterion, air quality index, statistics, mathematics, maximum likelihood, artificial intelligence, machine learning, geography, meteorology
Data extracted from air quality monitoring can require spatiotemporal clustering techniques. Many recent clustering techniques are based on mixture models; however, there is a shortage of model‐based approaches for spatiotemporal data. A new mixture model for clustering spatiotemporal data, named STM, is introduced, and its generic identifiability is proved. The model defines each mixture component as a mixture of autoregressive polynomial regressions whose weights depend on the spatial and temporal information through logistic links. Under the maximum likelihood framework, parameter estimation is carried out via an expectation–maximization algorithm, while classical information criteria can be used for model selection. The proposed model is applied to air quality monitoring data from the periphery of Paris, considering one of the critical pollutants, nitrogen dioxide, at different times during the day. The STM model is implemented in the R package SpaTimeClust.
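The two mechanisms named in the abstract, logistic weights driven by spatial and temporal information and model selection by an information criterion, can be illustrated with a minimal Python sketch. This is not the SpaTimeClust implementation and does not reproduce the paper's exact parameterization: the linear predictor (intercept, spatial coordinates, time), the function names `logistic_weights` and `bic`, and the coefficient matrix `lam` are all assumptions made for illustration.

```python
import numpy as np


def logistic_weights(coords, t, lam):
    """Weights of G regressions at one site and time (illustrative only).

    coords : spatial coordinates of the monitoring site, shape (d,)
    t      : time of the observation (scalar)
    lam    : hypothetical coefficients, shape (G, d + 2), one row per
             regression: [intercept, spatial coefficients..., time coefficient]
    Returns an array of G non-negative weights that sum to 1.
    """
    u = np.concatenate(([1.0], np.atleast_1d(coords), [t]))
    scores = lam @ u
    scores = scores - scores.max()  # stabilise the exponentials
    w = np.exp(scores)
    return w / w.sum()


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion in the 'lower is better' convention."""
    return -2.0 * loglik + n_params * np.log(n_obs)


# Assumed toy values: two regressions, 2-D coordinates, one time point.
lam = np.array([[0.0, 0.0, 0.0, 0.0],
                [0.5, -1.0, 0.2, 0.1]])
weights = logistic_weights(np.array([0.3, 0.7]), t=8.0, lam=lam)
```

Under these assumptions the weights shift smoothly between regressions as the site location or the time of day changes. In a full analysis, `bic` would be evaluated on the maximized log-likelihood returned by the EM algorithm for each candidate number of components.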