Open Access
Model-Based Clustering for Image Segmentation and Large Datasets via Sampling
Author(s) -
Ron Wehrens,
L.M.C. Buydens,
Chris Fraley,
Adrian E. Raftery
Publication year - 2004
Publication title - Journal of Classification
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.657
H-Index - 40
eISSN - 1432-1343
pISSN - 0176-4268
DOI - 10.1007/s00357-004-0018-8
Subject(s) - cluster analysis , computer science , pattern recognition (psychology) , segmentation , set (abstract data type) , artificial intelligence , sampling (signal processing) , data mining , sample (material) , remainder , simple (philosophy) , data set , simple random sample , multispectral image , mathematics , computer vision , filter (signal processing) , population , philosophy , chemistry , demography , arithmetic , chromatography , epistemology , sociology , programming language
The rapid increase in the size of data sets makes clustering all the more important to capture and summarize the information, at the same time making clustering more difficult to accomplish. If model-based clustering is applied directly to a large data set, it can be too slow for practical application. A simple and common approach is to first cluster a random sample of moderate size, and then use the clustering model found in this way to classify the remainder of the objects. We show that, in its simplest form, this method may lead to unstable results. Our experiments suggest that a stable method with better performance can be obtained with two straightforward modifications to the simple sampling method: several tentative models are identified from the sample instead of just one, and several EM steps are used rather than just one E step to classify the full data set. We find that there are significant gains from increasing the size of the sample up to about 2,000, but not from further increases. These conclusions are based on the application of several alternative strategies to the segmentation of three different multispectral images, and to several simulated data sets.
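
The two modifications described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional data, Gaussian mixtures with unequal variances, a quantile-based initialisation, and BIC (in the "larger is better" form) as the criterion for ranking tentative models. The function names (`em_step`, `run_em`, `sample_then_refine`, `classify`) and all default parameter values are hypothetical choices made for illustration.

```python
import math
import random


def em_step(xs, w, mu, var):
    """One EM iteration for a 1-D Gaussian mixture.

    Returns the updated (w, mu, var) and the log-likelihood of xs
    under the parameters that were passed in.
    """
    k = len(w)
    nk, sx, sxx = [0.0] * k, [0.0] * k, [0.0] * k
    loglik = 0.0
    for x in xs:
        # E step: unnormalised responsibilities of each component for x.
        dens = [w[j] * math.exp(-0.5 * (x - mu[j]) ** 2 / var[j])
                / math.sqrt(2 * math.pi * var[j]) for j in range(k)]
        total = sum(dens) or 1e-300  # guard against underflow to zero
        loglik += math.log(total)
        for j in range(k):
            r = dens[j] / total
            nk[j] += r
            sx[j] += r * x
            sxx[j] += r * x * x
    # M step: proportions, means and variances (with small floors).
    nk = [max(v, 1e-9) for v in nk]
    new_mu = [sx[j] / nk[j] for j in range(k)]
    new_var = [max(sxx[j] / nk[j] - new_mu[j] ** 2, 1e-6) for j in range(k)]
    new_w = [v / sum(nk) for v in nk]
    return (new_w, new_mu, new_var), loglik


def run_em(xs, params, steps):
    """Run a fixed number of EM steps; return final params and their log-likelihood."""
    for _ in range(steps):
        params, _ = em_step(xs, *params)
    _, loglik = em_step(xs, *params)  # evaluate only; the update is discarded
    return params, loglik


def init_params(xs, k):
    """Assumed initialisation: split the sorted values into k equal chunks."""
    s = sorted(xs)
    chunks = [s[i * len(s) // k:(i + 1) * len(s) // k] for i in range(k)]
    mu = [sum(c) / len(c) for c in chunks]
    var = [max(sum((x - m) ** 2 for x in c) / len(c), 1e-6)
           for c, m in zip(chunks, mu)]
    return [1.0 / k] * k, mu, var


def bic(loglik, k, n):
    """BIC, larger is better; a k-component 1-D mixture has 3k - 1 free parameters."""
    return 2 * loglik - (3 * k - 1) * math.log(n)


def sample_then_refine(data, sample_size=2000, max_k=4, n_tentative=2,
                       sample_steps=50, full_steps=5, seed=0):
    """Fit candidates on a random sample, then refine the best few on all data."""
    rng = random.Random(seed)
    sample = rng.sample(data, min(sample_size, len(data)))
    candidates = []
    for k in range(1, max_k + 1):
        params, ll = run_em(sample, init_params(sample, k), sample_steps)
        candidates.append((bic(ll, k, len(sample)), k, params))
    candidates.sort(key=lambda c: c[0], reverse=True)
    best = None
    # Several tentative models, each refined with several EM steps on the full data.
    for _, k, params in candidates[:n_tentative]:
        params, ll = run_em(data, params, full_steps)
        score = bic(ll, k, len(data))
        if best is None or score > best[0]:
            best = (score, k, params)
    return best  # (BIC on full data, number of components, (w, mu, var))


def classify(x, w, mu, var):
    """Index of the component with the highest posterior probability for x."""
    scores = [math.log(w[j]) - 0.5 * math.log(var[j])
              - 0.5 * (x - mu[j]) ** 2 / var[j] for j in range(len(w))]
    return max(range(len(w)), key=scores.__getitem__)


if __name__ == "__main__":
    rng = random.Random(1)
    data = ([rng.gauss(0, 1) for _ in range(3000)]
            + [rng.gauss(6, 1) for _ in range(3000)])
    score, k, (w, mu, var) = sample_then_refine(data)
    print(k, [round(m, 2) for m in mu])
```

Setting `n_tentative=1` and replacing the full-data refinement with a single E step would correspond to the simple sampling method that the abstract reports as potentially unstable. The default `sample_size=2000` merely echoes the sample size beyond which the abstract reports no further gains.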