Open Access
RETRACTED: Application research of data mining algorithm in big data environment
Author(s) - Zijun Zhao
Publication year - 2019
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1423/1/012045
Subject(s) - big data, computer science, data mining, adaptability, cluster analysis, field (mathematics), node (physics), process (computing), algorithm, data stream mining, naive bayes classifier, scale (ratio), map reduce, machine learning, engineering, mathematics, support vector machine, operating system, ecology, physics, structural engineering, quantum mechanics, pure mathematics, biology
The Hadoop platform forms a complete large-scale distributed ecosystem that includes HDFS, MapReduce, HBase, and other subsystems. This paper analyzes the parallel processing capabilities of the Hadoop platform and applies them to data mining algorithms. To improve algorithm efficiency, it proposes a K-Modes clustering algorithm built on a big data platform, in which each cluster is represented by its mode rather than by a central node. The mining process also uses naive Bayes to improve mining efficiency. The experimental results show that the proposed algorithm has better adaptability, saves time, and improves efficiency.
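
The abstract does not give the algorithm's details, so the following is only a minimal single-machine sketch of generic K-Modes clustering, not the paper's implementation. It illustrates the idea of representing each cluster by its mode (the most frequent value of each categorical attribute) instead of a numeric center. The function names `hamming`, `column_modes`, and `k_modes`, the simple matching distance, and the random initialization are all assumptions made for illustration. The Hadoop parallelization and the naive Bayes step are not shown.

```python
from collections import Counter
import random


def hamming(a, b):
    """Simple matching dissimilarity: count of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Return the most frequent value of each attribute column."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(records, k, max_iter=20, seed=0):
    """Cluster categorical records; each cluster is summarized by its mode.

    Assumption: initial modes are k records drawn at random.
    """
    rng = random.Random(seed)
    modes = rng.sample(records, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each record to its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(r, modes[j])) for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        # Update step: recompute the mode of every non-empty cluster.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    return modes, labels


if __name__ == "__main__":
    data = [
        ("red", "small", "round"),
        ("red", "small", "oval"),
        ("red", "large", "round"),
        ("blue", "large", "square"),
        ("blue", "large", "flat"),
        ("blue", "small", "square"),
    ]
    modes, labels = k_modes(data, k=2)
    print(modes, labels)
```

In a MapReduce setting, one plausible (assumed) split is to run the assignment step in mappers and the per-cluster mode computation in reducers, since each record's nearest mode can be found independently of the others.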