Open Access
New algorithm for clustering unlabeled big data
Author(s) - Marwan B. Mohammed, Wafaa Al-Hameed
Publication year - 2021
Publication title - Indonesian Journal of Electrical Engineering and Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.241
H-Index - 17
eISSN - 2502-4760
pISSN - 2502-4752
DOI - 10.11591/ijeecs.v24.i2.pp1054-1062
Subject(s) - cluster analysis, computer science, canopy clustering algorithm, data mining, sentence, cure data clustering algorithm, word (group theory), process (computing), data stream clustering, correlation clustering, artificial intelligence, algorithm, pattern recognition (psychology), mathematics, geometry, operating system
Clustering analysis techniques play an important role in data mining. Although several clustering techniques already exist, researchers still try to make the clustering process more efficient or to propose new techniques. Such techniques seek to allocate objects into clusters so that two objects in the same cluster are more similar than two objects in different clusters, taking care not to duplicate the same object in different groups while covering as much of the data as possible. This paper makes two contributions. The first is a new algorithm, named the MB algorithm, that collects unlabeled data and places them into appropriate groups. The second is the creation of a lexical sequence sentence (LCS) based on semantically similar sentences, which differs from the traditional lexical word chain (LCW) based on words. The results showed that the MB algorithm generally outperformed both the hierarchical clustering algorithm and the K-means algorithm.
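
The clustering objective described in the abstract (objects in the same cluster are more similar than objects in different clusters, each object belongs to exactly one group, and all data are covered) can be illustrated with a minimal sketch. The abstract does not specify the MB algorithm, so this example uses plain K-means, one of the baselines named above, and not the authors' method. The toy 2-D points, the deterministic initialisation, and the helper names `kmeans` and `mean_pair_distance` are assumptions made for illustration only.

```python
import math
from itertools import combinations


def kmeans(points, k, iterations=20):
    """Plain K-means (a baseline, NOT the MB algorithm); one label per point."""
    # Assumption: deterministic initialisation, the first k points are centroids.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to exactly one nearest centroid,
        # so no object is duplicated across clusters and every object is covered.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def mean_pair_distance(points, labels, same_cluster):
    """Average distance over pairs that share (or do not share) a cluster."""
    dists = [
        math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
        if (labels[i] == labels[j]) == same_cluster
    ]
    return sum(dists) / len(dists)


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1), (5.1, 4.9), (0.1, 0.3), (4.8, 5.2)]
labels = kmeans(points, k=2)

intra = mean_pair_distance(points, labels, same_cluster=True)
inter = mean_pair_distance(points, labels, same_cluster=False)
print(labels, intra < inter)
```

On this toy input the within-cluster average distance is smaller than the between-cluster average, which is the property the abstract asks of a good clustering. For sentence data, as in the paper, the Euclidean distance would have to be replaced by a semantic similarity measure, which the abstract does not detail.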
