Open Access

Improve BIRCH algorithm for big data clustering

Author(s) - Fanny Ramadhani, Muhammad Zarlis, Saib Suwilo

Publication year - 2020

Publication title - IOP Conference Series: Materials Science and Engineering

Language(s) - English

Resource type - Journals

eISSN - 1757-899X

pISSN - 1757-8981

DOI - 10.1088/1757-899x/725/1/012090

Subject(s) - cluster analysis, algorithm, tree (set theory), node (physics), computer science, matching (statistics), big data, mathematics, data mining, statistics, artificial intelligence, combinatorics, engineering, structural engineering

Big Data refers to collections of data with very large volumes and highly diverse sources, which must be managed with methods and tools suited to that scale. Clustering is one of the effective techniques for dealing with Big Data. The hierarchical BIRCH algorithm achieves short execution times and is well suited to grouping very large data sets. The algorithm builds a CF-tree in which all entries in each leaf node must satisfy the same threshold T, and the CF-tree is rebuilt at each stage with a different threshold. However, a static (fixed) threshold produces poor cluster quality. This paper proposes a solution to this deficiency: making the threshold value dynamic so that it produces good cluster quality, validated using the silhouette coefficient (SC). There is a very clear difference between the standard BIRCH algorithm and BIRCH with the modified T parameter (BIRCH (CF-Leaf (modif))). The modified algorithm produced 60% fewer CF-Nodes, total CF-Entries, and total CF-Leaf Entries than the standard BIRCH algorithm.
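To make the role of the threshold T concrete, the following is a minimal Python sketch of the clustering-feature (CF) absorption test used in standard BIRCH. It is not the paper's modified method: the abstract does not say how the dynamic threshold is computed, so the sketch only compares fixed thresholds. The names `CF` and `insert_flat` are hypothetical, and the CF-tree is simplified to a single flat list of leaf entries, with no branching factor and no node splits.

```python
from dataclasses import dataclass
from math import dist, sqrt
from typing import List, Sequence


@dataclass
class CF:
    """Clustering feature: point count, linear sum, and sum of squared norms."""
    n: int
    ls: List[float]
    ss: float

    @classmethod
    def from_point(cls, p: Sequence[float]) -> "CF":
        return cls(1, list(p), sum(x * x for x in p))

    def merged(self, other: "CF") -> "CF":
        # CFs are additive, so merging never needs the raw points.
        return CF(self.n + other.n,
                  [a + b for a, b in zip(self.ls, other.ls)],
                  self.ss + other.ss)

    def centroid(self) -> List[float]:
        return [x / self.n for x in self.ls]

    def radius(self) -> float:
        # Root-mean-square distance of the member points from the centroid.
        c2 = sum(x * x for x in self.centroid())
        return sqrt(max(self.ss / self.n - c2, 0.0))


def insert_flat(points: Sequence[Sequence[float]], threshold: float) -> List[CF]:
    """Assumed flat simplification of leaf insertion (no tree, no splits).

    Each point is absorbed by the entry with the closest centroid if the
    merged entry's radius stays within `threshold`; otherwise the point
    starts a new entry.
    """
    entries: List[CF] = []
    for p in points:
        new = CF.from_point(p)
        if entries:
            i = min(range(len(entries)),
                    key=lambda k: dist(entries[k].centroid(), p))
            candidate = entries[i].merged(new)
            if candidate.radius() <= threshold:
                entries[i] = candidate
                continue
        entries.append(new)
    return entries


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
for t in (0.01, 0.5, 10.0):
    print(f"T={t}: {len(insert_flat(points, t))} leaf entries")
```

In this toy data set, a very small T leaves every point in its own entry, while a very large T merges everything into one entry. That sensitivity to a fixed T is the motivation the abstract gives for adapting the threshold.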
