ADCN: An anisotropic density‐based clustering algorithm for discovering spatial point patterns with noise
Author(s) - Mai Gengchen, Janowicz Krzysztof, Hu Yingjie, Gao Song
Publication year - 2018
Publication title - Transactions in GIS
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.721
H-Index - 63
eISSN - 1467-9671
pISSN - 1361-1682
DOI - 10.1111/tgis.12313
Subject(s) - dbscan , cluster analysis , computer science , noise (video) , algorithm , point (geometry) , data point , pattern recognition (psychology) , data mining , artificial intelligence , cure data clustering algorithm , correlation clustering , mathematics , image (mathematics) , geometry
Density‐based clustering algorithms such as DBSCAN have been widely used for spatial knowledge discovery as they offer several key advantages compared with other clustering algorithms. They can discover clusters with arbitrary shapes, are robust to noise, and do not require prior knowledge (or estimation) of the number of clusters. The idea of using a scan circle centered at each point with a search radius Eps to find at least MinPts points as a criterion for deriving local density is easily understandable and sufficient for exploring isotropic spatial point patterns. However, there are many cases that cannot be adequately captured this way, particularly if they involve linear features or shapes with a continuously changing density, such as a spiral. In such cases, DBSCAN tends to either create an increasing number of small clusters or add noise points into large clusters. Therefore, in this article, we propose a novel anisotropic density‐based clustering algorithm (ADCN). To motivate our work, we introduce synthetic and real‐world cases that cannot be handled sufficiently by DBSCAN (or OPTICS). We then present our clustering algorithm and test it with a wide range of cases. We demonstrate that our algorithm can perform as well as DBSCAN in cases that do not benefit explicitly from an anisotropic perspective, and that it outperforms DBSCAN in cases that do. Finally, we show that our approach has the same time complexity as DBSCAN and OPTICS, namely O(n log n) when using a spatial index and O(n^2) otherwise. We provide an implementation and test the runtime over multiple cases.
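The isotropic scan-circle criterion described above can be illustrated with a minimal Python sketch. This is not the authors' ADCN implementation; it only shows the standard DBSCAN-style density test that the abstract takes as its starting point. The function names `eps_neighbors` and `is_core_point` and the sample coordinates are hypothetical, and the sketch assumes the common convention that a point counts as a member of its own Eps-neighborhood.

```python
import math

def eps_neighbors(points, i, eps):
    """Return indices of all points within distance eps of points[i].

    Brute-force scan (no spatial index); the query point itself is included.
    """
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points)
            if math.hypot(x - xi, y - yi) <= eps]

def is_core_point(points, i, eps, min_pts):
    """Isotropic density test: at least min_pts points inside the scan circle."""
    return len(eps_neighbors(points, i, eps)) >= min_pts

# Hypothetical data: four points along a line plus one distant outlier.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 10.0)]
core_flags = [is_core_point(points, i, eps=1.5, min_pts=3)
              for i in range(len(points))]
print(core_flags)  # [False, True, True, False, False]
```

Because each neighborhood query scans every point, running the test for all n points costs O(n^2), which matches the no-index case mentioned in the abstract. Note also that the two end points of the line fail the test even though they clearly belong to the linear feature, a small instance of the difficulty a circular neighborhood has with linear patterns.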