On the strong consistency of feature‐weighted k‐means clustering in a nearmetric space
Author(s) - Saptarshi Chakraborty, Swagatam Das
Publication year - 2019
Publication title - Stat
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.61
H-Index - 18
ISSN - 2049-1573
DOI - 10.1002/sta4.227
Subject(s) - cluster analysis , k‐means clustering , strong consistency , dissimilarity measure , feature weighting , independent and identically distributed random variables , sampling , feature vector , pattern recognition , algorithm , data mining , artificial intelligence , statistics , mathematics , computer science
Weighted k‐means (WK‐means) is a well‐known method for automated feature weight learning in a conventional k‐means clustering framework. In this paper, we analytically explore the strong consistency of the WK‐means algorithm under independent and identically distributed sampling of the data points. The choice of dissimilarity measure plays a key role in partitioning the data and detecting the inherent groups in a dataset. We propose a proof of strong consistency of the WK‐means algorithm when the dissimilarity measure used is assumed to be a nearmetric. The proof can be further extended to dissimilarity measures that are an increasing function of a nearmetric. Through detailed experiments, we demonstrate that WK‐means‐type algorithms, equipped with a nearmetric, can be highly effective, especially when some of the features are unimportant in revealing the cluster structure of the dataset.
