Model clustering and its application to water quality monitoring
Author(s) - Zhu Rong, ElShaarawi Abdel H.
Publication year - 2009
Publication title - Environmetrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.68
H-Index - 58
eISSN - 1099-095X
pISSN - 1180-4009
DOI - 10.1002/env.922
Subject(s) - pairwise comparison, cluster analysis, similarity (geometry), data mining, set (abstract data type), computer science, partition (number theory), artificial intelligence, mathematics, pattern recognition (psychology), machine learning, image (mathematics), combinatorics, programming language
Classifying objects into groups whose members share a set of common traits is important in many areas of application, particularly in environmental pollution studies. Consider the situation where variables are measured on different occasions for each of K objects, and the objective is to classify these objects into groups according to some common characteristics. The procedure introduced in this paper has two parts: model fitting and clustering. In the model-fitting part, a family of models appropriate for the structure and nature of the available measurements is selected, and models from this family are fitted to both the individual and the pooled datasets. The clustering starts with K models that represent the K objects, so the similarity of the objects reduces to the similarity of their models. Since the models are members of the same family, two models are defined to be similar when their parameters of interest are equal. Here, we partition the parameter vector into two sub-vectors corresponding to the parameters of interest and the ancillary parameters. The clustering groups together objects that share common parameters of interest while allowing the ancillary parameters to be object-specific. The p-value associated with the proposed model linking test is used as the similarity measure. Several grouping strategies are proposed, such as cluster peeling and pairwise combining, as well as a speed-up technique called splitting-and-binding. A small simulation study is used to demonstrate the utility of the method. The paper concludes by presenting an environmental application where the interest is to classify E. coli bacteria according to their responses to antibiotic treatments. The data were collected bi-weekly at several locations within three Canadian watersheds during 2005. Metric closeness in parameter space, used by conventional methods, and likelihood closeness in model space, employed by model clustering, are discussed in this application.
Copyright © 2008 John Wiley & Sons, Ltd.
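
The linking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every specific in it is an assumption: the model family is taken to be Gaussian, with the mean as the parameter of interest and the variance as an object-specific ancillary parameter; a likelihood-ratio test with an asymptotic chi-square reference (1 degree of freedom) stands in for the paper's model linking test; and objects are grouped by simple thresholding of pairwise p-values, not by cluster peeling, pairwise combining, or splitting-and-binding. The function names and the 0.05 threshold are hypothetical.

```python
import math


def gaussian_loglik(x, mu, var):
    """Gaussian log-likelihood of sample x for a given mean and variance."""
    n = len(x)
    ss = sum((v - mu) ** 2 for v in x)
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)


def separate_loglik(x):
    """Maximised log-likelihood when the object has its own mean and variance."""
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    return gaussian_loglik(x, mu, var)


def linked_loglik(a, b, iters=200):
    """Maximised log-likelihood under a common mean and object-specific variances.

    Uses a fixed-point iteration from several starting values and keeps the
    best result, because the profile likelihood can have more than one mode.
    Assumes neither sample is constant.
    """
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    pooled = (sum(a) + sum(b)) / (len(a) + len(b))
    best = -math.inf
    for mu in (mean_a, mean_b, pooled):
        for _ in range(iters):
            var_a = sum((v - mu) ** 2 for v in a) / len(a)
            var_b = sum((v - mu) ** 2 for v in b) / len(b)
            w_a, w_b = len(a) / var_a, len(b) / var_b
            mu = (w_a * mean_a + w_b * mean_b) / (w_a + w_b)
        var_a = sum((v - mu) ** 2 for v in a) / len(a)
        var_b = sum((v - mu) ** 2 for v in b) / len(b)
        ll = gaussian_loglik(a, mu, var_a) + gaussian_loglik(b, mu, var_b)
        best = max(best, ll)
    return best


def linking_p_value(a, b):
    """p-value of a likelihood-ratio test that a and b share the same mean."""
    stat = 2.0 * (separate_loglik(a) + separate_loglik(b) - linked_loglik(a, b))
    stat = max(0.0, stat)  # guard against tiny negative values from rounding
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


def group_objects(samples, alpha=0.05):
    """Group objects whose pairwise linking p-value exceeds alpha.

    Groups are the connected components of the 'not rejected' graph.
    """
    k = len(samples)
    label = list(range(k))
    for i in range(k):
        for j in range(i + 1, k):
            if linking_p_value(samples[i], samples[j]) > alpha:
                old, new = label[j], label[i]
                label = [new if l == old else l for l in label]
    groups = {}
    for idx, l in enumerate(label):
        groups.setdefault(l, []).append(idx)
    return sorted(groups.values())


# Hypothetical toy data: objects 0 and 1 share a mean, object 2 does not.
site_a = [1.0, 1.2, 0.8, 1.1, 0.9]
site_b = [1.05, 0.95, 1.3, 0.7, 1.0]
site_c = [5.0, 5.2, 4.8, 5.1, 4.9]
groups = group_objects([site_a, site_b, site_c])
```

In this sketch a large p-value means the data give no evidence against a shared parameter of interest, so the p-value acts as a similarity: objects are compared by how well a common model fits them, not by the distance between their separately estimated parameters.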
