Using an MST based Value for ε in DBSCAN Algorithm for Obtaining Better Result
Author(s) - Nirmalya Chowdhury, Preetha Bhattacharjee
Publication year - 2014
Publication title - International Journal of Information Technology and Computer Science
Language(s) - English
Resource type - Journals
eISSN - 2074-9015
pISSN - 2074-9007
DOI - 10.5815/ijitcs.2014.06.08
Subject(s) - computer science, value (mathematics), DBSCAN, algorithm, artificial intelligence, machine learning, cluster analysis, correlation clustering, canopy clustering algorithm
In this paper, an objective function based on the minimal spanning tree (MST) of the data points is proposed for clustering, and a density-based clustering technique is used in an attempt to optimize this objective function in order to detect the "natural grouping" present in a given data set. A threshold based on the MST of the data points of each cluster thus found is used to remove noise (if any is present in the data) from the final clustering. A comparison of the experimental results obtained by the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm and the proposed algorithm is also included. It is observed that the proposed algorithm performs better than DBSCAN. Several experiments on synthetic data sets show the utility of the proposed method. The proposed method is also found to provide good results for two real-life data sets considered for experimentation. Note that k-means is one of the most popular methods adopted to solve the clustering problem. This algorithm uses an objective function based on minimization of a squared-error criterion. It may not always provide the "natural grouping", though it is useful in many applications.
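The abstract does not state the objective function or the rule that maps the MST to ε, so the following is only a minimal sketch of the general idea: derive ε from the MST edge lengths, then run a density-based clustering with that value. The heuristic used here (ε = 1.5 × the median MST edge length), the `factor` and `min_pts` parameters, and the function names `mst_edge_lengths`, `mst_epsilon`, and `dbscan` are all assumptions made for illustration, not the authors' method. The per-cluster MST threshold for noise removal is not covered.

```python
import math
import statistics


def mst_edge_lengths(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the n-1 edge lengths of a minimal spanning tree.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known link from each point to the tree
    best[0] = 0.0
    lengths = []
    for step in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:  # the root is added without an edge
            lengths.append(best[u])
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths


def mst_epsilon(points, factor=1.5):
    """ASSUMED heuristic: eps = factor * median MST edge length."""
    return factor * statistics.median(mst_edge_lengths(points))


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, -1 meaning noise."""
    n = len(points)
    labels = [None] * n  # None = not yet visited

    def neighbours(i):
        # The eps-neighbourhood includes the point itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point, not expanded further
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nb)
    return labels


# Toy data: two unit squares far apart plus one isolated point.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 10), (11, 10), (10, 11), (11, 11),
          (5, 20)]
eps = mst_epsilon(points)
labels = dbscan(points, eps, min_pts=3)
```

On this toy data the long MST edges that bridge the groups do not affect the median, so ε stays close to the within-cluster spacing; the two squares come out as separate clusters and the isolated point is labelled noise. A median-based rule is only one of several plausible choices, and a different statistic of the MST edge lengths may suit other data better.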