A NOTE ON PROCEDURES FOR TESTING THE QUALITY OF A CLUSTERING OF A SET OF OBJECTS
Author(s) - Milligan, Glenn W.; Mahajan, Vijay
Publication year - 1980
Publication title - Decision Sciences
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.238
H-Index - 108
eISSN - 1540-5915
pISSN - 0011-7315
DOI - 10.1111/j.1540-5915.1980.tb01168.x
Subject(s) - cluster analysis , computer science , data mining , set (abstract data type) , index (typography) , measure (data warehouse) , statistical hypothesis testing , quality (philosophy) , point (geometry) , cluster (spacecraft) , sampling (signal processing) , econometrics , statistics , mathematics , machine learning , computer vision , philosophy , geometry , epistemology , filter (signal processing) , world wide web , programming language
Despite the increased application of cluster analysis in decision sciences, few attempts have been made to derive hypothesis‐testing procedures for the evaluation of clustering solutions. In fact, the present paper shows that at least one such attempt failed to specify a meaningful sampling distribution for the test procedure. An alternative index based on the concept of point‐biserial correlation is proposed as a possible recovery measure. The index is subsequently used to form the basis of a valid statistical test for the existence of cluster structure.
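To make the idea of a point-biserial index concrete, the following is a minimal sketch, not the authors' procedure. It assumes one common reading of such an index: the Pearson correlation between every pairwise distance and a 0/1 indicator of whether the two objects fall in different clusters. The function names, the Euclidean distance, and the label-permutation reference distribution are all assumptions made for illustration; the abstract does not specify the sampling distribution the paper actually uses.

```python
import itertools
import math
import random


def point_biserial_index(points, labels):
    """Correlate pairwise distances with a 0/1 'different cluster' indicator.

    Assumed formulation: higher values mean that between-cluster pairs
    tend to be farther apart than within-cluster pairs.
    """
    dists, differs = [], []
    pairs = itertools.combinations(zip(points, labels), 2)
    for (p, label_p), (q, label_q) in pairs:
        dists.append(math.dist(p, q))  # Euclidean distance (assumption)
        differs.append(1.0 if label_p != label_q else 0.0)

    n = len(dists)
    mean_d = sum(dists) / n
    mean_i = sum(differs) / n
    cov = sum((d - mean_d) * (i - mean_i) for d, i in zip(dists, differs))
    var_d = sum((d - mean_d) ** 2 for d in dists)
    var_i = sum((i - mean_i) ** 2 for i in differs)
    if var_d == 0 or var_i == 0:
        # Correlation is undefined (e.g. a single cluster); report 0 by convention.
        return 0.0
    return cov / math.sqrt(var_d * var_i)


def permutation_p_value(points, labels, n_perm=999, seed=0):
    """Illustrative reference distribution: shuffle labels, recompute the index.

    This is an assumption for demonstration only. If the labels came from a
    clustering algorithm run on the same data, this p-value is optimistic.
    """
    rng = random.Random(seed)
    observed = point_biserial_index(points, labels)
    shuffled = list(labels)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if point_biserial_index(points, shuffled) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)
```

Calling `point_biserial_index(points, labels)` with a list of coordinate tuples and a parallel list of cluster labels returns a value between -1 and 1. A partition that matches well-separated groups should score close to 1, while an arbitrary partition of the same points should score noticeably lower.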