A K-means Optimization Algorithm Suitable for Fast Clustering of WebGIS Massive Data

Hao He; Bo Sun; Yan Yang; Jun Chen

Open Access

Publication year - 2022

Publication title - Journal of Physics: Conference Series

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.21

H-Index - 85

eISSN - 1742-6596

pISSN - 1742-6588

DOI - 10.1088/1742-6596/2171/1/012069

Subject(s) - cluster analysis, data mining, computer science, cure data clustering algorithm, correlation clustering, stability (learning theory), grid, k medians clustering, data stream clustering, set (abstract data type), canopy clustering algorithm, determining the number of clusters in a data set, cluster (spacecraft), algorithm, mathematics, artificial intelligence, machine learning, geometry, programming language

K-means is fast, which makes it suitable for clustering large-scale WebGIS geographic data. However, because K-means selects its initial cluster centers at random, its results are unstable and its clustering accuracy is poor. Some existing studies have improved clustering accuracy and stability, but at the cost of a large increase in clustering time. This article proposes GBK-means, a grid-based improvement of K-means that uses an adaptive grid method to obtain the initial cluster centers. First, the grid-division parameters are determined from the distribution of the sample data. Then the connected regions of dense grid cells are identified and their centers are computed, and the initial cluster centers are derived from these. Experimental results on a real WebGIS data set show that GBK-means clusters better and faster than K-means, K-means++, and the methods of literature [2] and literature [3]. Its average F value, accuracy rate, and adjusted Rand index (ARI) are 10.9%, 11%, and 11.2% higher than those of K-means, respectively. Its average clustering time is 75.4%, 52.3%, 85.1%, and 91.1% shorter than that of K-means, K-means++, literature [2], and literature [3], respectively.
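The grid-based seeding idea can be illustrated with a minimal Python sketch. It is not the authors' GBK-means implementation: the abstract does not say how the grid parameters are adapted to the data, so the hypothetical arguments `cell_size` and `min_points` are fixed by hand here, a cell is assumed to be "dense" when it holds at least `min_points` points, and neighboring cells are assumed to be connected through the 8-neighborhood. The function names `grid_initial_centers` and `kmeans` are likewise illustrative.

```python
from collections import defaultdict, deque


def grid_initial_centers(points, cell_size, min_points):
    """Return one seed per connected region of dense grid cells (2-D points)."""
    # Bin every point into a square grid cell keyed by integer (column, row).
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))

    # Assumed density rule: a cell is dense if it holds >= min_points points.
    dense = {key for key, members in cells.items() if len(members) >= min_points}

    centers = []
    seen = set()
    for start in sorted(dense):
        if start in seen:
            continue
        # Breadth-first search over the 8-neighborhood collects one region.
        seen.add(start)
        queue = deque([start])
        region = []
        while queue:
            cx, cy = queue.popleft()
            region.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbor = (cx + dx, cy + dy)
                    if neighbor in dense and neighbor not in seen:
                        seen.add(neighbor)
                        queue.append(neighbor)
        # The seed for this region is the mean of all points it contains.
        n = len(region)
        centers.append((sum(p[0] for p in region) / n,
                        sum(p[1] for p in region) / n))
    return centers


def kmeans(points, centers, iterations=20):
    """Plain Lloyd iterations started from the given (deterministic) seeds."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            # Assign each point to the nearest center (squared Euclidean).
            nearest = min(
                range(len(centers)),
                key=lambda i: (p[0] - centers[i][0]) ** 2
                + (p[1] - centers[i][1]) ** 2,
            )
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            (sum(q[0] for q in g) / len(g), sum(q[1] for q in g) / len(g))
            if g else c
            for g, c in zip(groups, centers)
        ]
    return centers
```

Calling `kmeans(points, grid_initial_centers(points, cell_size, min_points))` gives the same result on every run, because the seeds depend only on the data and the two grid parameters rather than on a random draw. In this sketch the number of clusters is whatever number of dense regions the grid yields; how GBK-means reconciles that with a requested K is not stated in the abstract.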
