Open Access
Massively scalable density based clustering (DBSCAN) on the HPCC systems big data platform
Author(s) -
H R Yatish,
Shubham Milind Phal,
Tanmay Sanjay Hukkeri,
Lili Xu,
G Shobha,
Jyoti Shetty,
Arjuna Chala
Publication year - 2021
Publication title - IAES International Journal of Artificial Intelligence
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.341
H-Index - 7
eISSN - 2252-8938
pISSN - 2089-4872
DOI - 10.11591/ijai.v10.i1.pp207-214
Subject(s) - computer science , dbscan , scalability , cluster analysis , data mining , algorithm , parallel computing , cure data clustering algorithm , artificial intelligence , correlation clustering , database
Dealing with large samples of unlabeled data is a key challenge today, especially in applications such as traffic pattern analysis and disaster management. DBSCAN (density-based spatial clustering of applications with noise) is a well-known density-based clustering algorithm. Its key strengths are its ability to detect outliers and to handle arbitrarily shaped clusters. However, because the algorithm is fundamentally sequential, it becomes expensive and time-consuming on very large datasets. This paper therefore presents a novel implementation of a parallel and distributed DBSCAN algorithm on the HPCC Systems platform. The implementation seeks to fully parallelize the algorithm by making use of HPCC Systems' optimal distributed architecture and by performing a tree-based union to merge local clusters. The proposed approach was tested on both synthetic and standard datasets (MFCCs Data Set) and was found to be completely accurate. Additionally, compared with a single-node setup, computation time decreased significantly with no impact on accuracy. The parallelized algorithm performed eight times better for larger numbers of data points and took exponentially less time as the number of data points increased.
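
The tree-based union mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's HPCC Systems implementation; it only shows the general idea of merging per-node cluster labels pairwise, level by level, with a union-find structure. The names `find`, `union`, `merge_pair`, and `tree_merge` are hypothetical, as is the input layout (one dictionary per node mapping a point id to a node-unique cluster label). The sketch also assumes that any point appearing in the results of two nodes is a core point, so that sharing it justifies merging the two local clusters.

```python
from typing import Dict, List, Tuple

Labels = Dict[int, str]            # point id -> local cluster label
Parent = Dict[str, str]            # union-find forest over cluster labels
Partial = Tuple[Labels, Parent]    # partial result carried up the merge tree


def find(parent: Parent, label: str) -> str:
    """Return the representative label, shortening the path on the way."""
    parent.setdefault(label, label)
    while parent[label] != label:
        parent[label] = parent[parent[label]]
        label = parent[label]
    return label


def union(parent: Parent, a: str, b: str) -> None:
    """Join the sets containing a and b; the smaller label becomes the root."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        low, high = sorted((root_a, root_b))
        parent[high] = low


def merge_pair(left: Partial, right: Partial) -> Partial:
    """Merge two partial results; a shared point links its two clusters."""
    labels_l, parent_l = left
    labels_r, parent_r = right
    # Labels are assumed node-unique, so the two forests have disjoint keys.
    parent = {**parent_l, **parent_r}
    labels = dict(labels_l)
    for point, label in labels_r.items():
        if point in labels:
            union(parent, labels[point], label)
        else:
            labels[point] = label
    return labels, parent


def tree_merge(node_results: List[Labels]) -> Labels:
    """Reduce per-node results pairwise until one global labelling remains."""
    if not node_results:
        return {}
    level: List[Partial] = [(dict(r), {}) for r in node_results]
    while len(level) > 1:
        merged = [merge_pair(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an odd node out moves up unchanged
            merged.append(level[-1])
        level = merged
    labels, parent = level[0]
    return {point: find(parent, label) for point, label in labels.items()}


# Hypothetical results from three nodes; points 3 and 4 are seen by two nodes.
node0 = {1: "n0c0", 2: "n0c0", 3: "n0c0"}
node1 = {3: "n1c0", 4: "n1c0"}
node2 = {4: "n2c0", 5: "n2c0", 8: "n2c1", 9: "n2c1"}
global_labels = tree_merge([node0, node1, node2])
```

In this example, points 1 to 5 end up in one global cluster because the local clusters chain together through the shared points 3 and 4, while points 8 and 9 remain a separate cluster. With n nodes the pairwise reduction needs about log2(n) merge rounds, and the merges within a round are independent of each other, which is what makes a tree-shaped union attractive in a distributed setting.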
