
Research and application of improved K-means based on MapReduce
Author(s) - Hongqin Wang, Hongxia Wang, LiQing Jiang, Zhengjun Pan
Publication year - 2020
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1651/1/012074
Subject(s) - computer science , big data , scalability , cluster analysis , data mining , data processing , database , machine learning
As data volumes grow, the traditional K-Means clustering algorithm used in data mining becomes inefficient and scales poorly on massive data sets. In this work, MapReduce on the Hadoop platform was used to parallelize the K-Means algorithm, and the performance of the resulting algorithm was tested experimentally. The results show that the improved K-Means algorithm scales well in parallel, runs efficiently, and has strong potential for big data mining. The algorithm was also applied to customer consumption data from a restaurant chain, which verified its effectiveness and showed that it can better support the chain's decision-making.
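The abstract does not describe the specific improvement the authors made, so the following is only a minimal sketch of how a standard K-Means iteration is commonly expressed in map and reduce steps, simulated in plain Python rather than run on Hadoop. The function names (`kmeans_map`, `kmeans_reduce`, `kmeans_iteration`, `kmeans`) and the sample points are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import dist


def kmeans_map(point, centroids):
    """Map step: emit (index of the nearest centroid, (point, 1))."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, (point, 1)


def kmeans_reduce(values):
    """Reduce step: average all points that share one centroid key."""
    total = None
    count = 0
    for point, n in values:
        total = list(point) if total is None else [a + b for a, b in zip(total, point)]
        count += n
    return tuple(t / count for t in total)


def kmeans_iteration(points, centroids):
    """One simulated MapReduce round: map, group by key (shuffle), reduce."""
    groups = defaultdict(list)
    for p in points:
        key, value = kmeans_map(p, centroids)
        groups[key].append(value)
    # Assumption: a centroid with no assigned points keeps its old position.
    return [kmeans_reduce(groups[i]) if i in groups else tuple(centroids[i])
            for i in range(len(centroids))]


def kmeans(points, centroids, max_iter=20, tol=1e-9):
    """Repeat rounds until no centroid moves more than tol, or max_iter is hit."""
    for _ in range(max_iter):
        new = kmeans_iteration(points, centroids)
        shift = max(dist(a, b) for a, b in zip(new, centroids))
        centroids = new
        if shift <= tol:
            break
    return centroids


# Hypothetical 2-D data, e.g. (visits per month, average spend) per customer.
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
result = kmeans(points, [(0, 0), (10, 10)])
```

In a real Hadoop job each map call would run on a separate data split and the grouping would be done by the shuffle phase; because the map step only needs the current centroids and one point at a time, the work parallelizes across nodes, which is the property the abstract's scalability claim relies on.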