An initial clustering center optimization method based on neighbourhood density for K-means
Author(s) - Mingxue Luo, Yingchun Yuan, Kejian Wang
Publication year - 2021
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1748/3/032016
Subject(s) - cluster analysis, grid, k medians clustering, algorithm, merge (version control), cure data clustering algorithm, computer science, outlier, correlation clustering, mathematics, data mining, artificial intelligence, geometry, information retrieval
The traditional K-means algorithm selects its initial clustering centers at random and is sensitive to outliers, which leads to unstable clustering results and low accuracy. To address these problems, the NDK-means algorithm, based on neighbourhood density, is proposed. First, the grid distribution characteristics of the samples are obtained by multi-dimensional grid division. Then, by defining the grid density and the grid neighbourhood density, several local high-density grids are identified. At the same time, an iteration factor is introduced to merge adjacent high-density grids, yielding a candidate set of initial clustering centers. Finally, combining grid density and distance, K initial clustering centers are selected with the Max-Min distance algorithm. Experiments on UCI datasets show that, compared with the K-means algorithm and a comparison algorithm from the literature, NDK-means improves accuracy by nearly 11% and 4%, and iteration speed by 70% and 60%, respectively. The algorithm improves clustering accuracy, and its results are stable.
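
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas for grid density, grid neighbourhood density, or the iteration factor, so the sketch substitutes simple assumptions. Cells are equal-width, a cell's density is its sample count, a cell counts as "local high-density" when no adjacent cell holds more samples, and the merging step is omitted. The names `candidate_centers`, `max_min_select`, `ndk_style_init`, and the parameter `bins` are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import dist


def candidate_centers(points, bins):
    """Return [(centroid, density)] for locally densest grid cells.

    Assumptions (not from the paper): equal-width cells, density = sample
    count, and a cell qualifies if no adjacent cell holds more samples.
    """
    dims = len(points[0])
    lo = [min(p[j] for p in points) for j in range(dims)]
    hi = [max(p[j] for p in points) for j in range(dims)]
    span = [(h - l) or 1.0 for l, h in zip(lo, hi)]  # avoid division by zero

    members = defaultdict(list)  # cell index tuple -> samples in that cell
    for p in points:
        cell = tuple(
            min(int((p[j] - lo[j]) / span[j] * bins), bins - 1)
            for j in range(dims)
        )
        members[cell].append(p)

    # All adjacent cells, including diagonals, excluding the cell itself.
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]
    candidates = []
    for cell, pts in members.items():
        neighbour_max = max(
            len(members.get(tuple(c + d for c, d in zip(cell, o)), ()))
            for o in offsets
        )
        if len(pts) >= neighbour_max:
            centroid = tuple(sum(p[j] for p in pts) / len(pts) for j in range(dims))
            candidates.append((centroid, len(pts)))
    return candidates


def max_min_select(candidates, k):
    """Max-min distance selection: start from the densest candidate (an
    assumption), then repeatedly add the candidate whose distance to its
    nearest already-chosen center is largest."""
    centers = [c for c, _ in candidates]
    first = max(range(len(centers)), key=lambda i: candidates[i][1])
    chosen = [first]
    nearest = [dist(c, centers[first]) for c in centers]
    while len(chosen) < min(k, len(centers)):
        nxt = max(range(len(centers)), key=nearest.__getitem__)
        chosen.append(nxt)
        nearest = [min(n, dist(c, centers[nxt])) for n, c in zip(nearest, centers)]
    return [centers[i] for i in chosen]


def ndk_style_init(points, k, bins=5):
    """Grid-density candidates followed by max-min selection.
    Returns fewer than k centers if fewer candidates are found."""
    return max_min_select(candidate_centers(points, bins), k)
```

The returned centers would then seed an ordinary K-means run in place of random initialization. The number of neighbouring cells grows as 3^d - 1 with the dimension d, so this simplified neighbourhood test is only practical for low-dimensional data.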