
An initial clustering center optimization method based on neighbourhood density for K-means
Author(s) - Mingxue Luo, Yingchun Yuan, Kejian Wang
Publication year - 2021
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1748/3/032016
Subject(s) - cluster analysis , grid , k medians clustering , algorithm , merge (version control) , cure data clustering algorithm , computer science , outlier , correlation clustering , mathematics , data mining , artificial intelligence , geometry , information retrieval
The traditional K-means algorithm selects its initial clustering centers at random and is sensitive to outliers, which leads to unstable clustering results and low accuracy. To address these problems, the NDK-means algorithm, based on neighbourhood density, is proposed. First, the grid distribution characteristics of the samples are obtained by multi-dimensional grid division. Then, by defining the grid density and the grid neighbourhood density, several local high-density grids are identified. An iteration factor is also introduced to merge adjacent high-density grids, yielding a candidate set of initial clustering centers. Finally, K initial clustering centers are selected from the candidates with the Max-Min distance algorithm, which combines grid density and distance. Experiments on UCI data show that, compared with the K-means algorithm and an algorithm from the literature, NDK-means improves accuracy by nearly 11% and 4% and iteration speed by 70% and 60%, respectively. The algorithm improves clustering accuracy and produces stable results.
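
The abstract does not give the formulas for grid density, grid neighbourhood density, or the iteration factor, so the following Python sketch illustrates only the general idea and is not the authors' NDK-means. It rests on several assumptions: grid density is taken as the raw sample count of a cell, a cell counts as "high-density" when its count is at least the mean count over occupied cells, the merging of adjacent high-density cells is omitted, and density is used only to pick the first center before a plain max-min distance rule picks the rest. The names `grid_candidates`, `max_min_init`, and the parameter `bins` are hypothetical.

```python
import numpy as np
from collections import defaultdict


def grid_candidates(X, bins=4):
    """Bucket samples into a uniform grid and return dense-cell candidates.

    Assumptions (not from the paper): density = sample count per cell;
    a cell is "high-density" if its count >= mean count of occupied cells.
    Returns (centroids, counts) for the high-density cells only.
    """
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against zero-width axes
    # Integer cell index per dimension, clipped so the maximum lands in the last cell.
    idx = np.minimum(((X - lo) / span * bins).astype(int), bins - 1)

    cells = defaultdict(list)
    for row, key in zip(X, map(tuple, idx)):
        cells[key].append(row)

    centroids = np.array([np.mean(v, axis=0) for v in cells.values()])
    counts = np.array([len(v) for v in cells.values()])
    dense = counts >= counts.mean()
    return centroids[dense], counts[dense]


def max_min_init(centroids, counts, k):
    """Pick k initial centers: densest candidate first, then max-min distance."""
    if len(centroids) < k:
        raise ValueError("fewer high-density candidates than requested centers")
    chosen = [int(np.argmax(counts))]
    # d[i] = distance from candidate i to its nearest already-chosen center.
    d = np.linalg.norm(centroids - centroids[chosen[0]], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(d))  # candidate farthest from all chosen centers
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(centroids - centroids[nxt], axis=1))
    return centroids[chosen]
```

The returned centers could then be passed to any standard K-means implementation as its starting points in place of random initialization.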