Density‐based clustering
Author(s) -
Campello Ricardo J. G. B.,
Kröger Peer,
Sander Jörg,
Zimek Arthur
Publication year - 2019
Publication title -
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.506
H-Index - 47
eISSN - 1942-4795
pISSN - 1942-4787
DOI - 10.1002/widm.1343
Subject(s) - cluster analysis, single linkage clustering, computer science, data mining, correlation clustering, cure data clustering algorithm, preprocessor, dbscan, pattern recognition (psychology), outlier, hierarchical clustering, determining the number of clusters in a data set, data set, cluster (spacecraft), consensus clustering, set (abstract data type), artificial intelligence, programming language
Clustering refers to the task of identifying groups or clusters in a data set. In density‐based clustering, a cluster is a set of data objects spread in the data space over a contiguous region of high density of objects. Density‐based clusters are separated from each other by contiguous regions of low density of objects. Data objects located in low‐density regions are typically considered noise or outliers. In this review article, we discuss the statistical notion of density‐based clusters, classic algorithms for deriving a flat partitioning of density‐based clusters, methods for hierarchical density‐based clustering, and methods for semi‐supervised clustering. We conclude with some open challenges related to density‐based clustering.
This article is categorized under: Technologies > Data Preprocessing; Ensemble Methods > Structure Discovery; Algorithmic Development > Hierarchies and Trees
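To make the notion of a density-based cluster concrete, the following is a minimal sketch of a DBSCAN-style flat partitioning, written for illustration and not taken from the article. It assumes Euclidean distance and two parameters whose names are chosen here: `eps` (neighborhood radius) and `min_pts` (minimum number of objects, including the object itself, for a neighborhood to count as dense). Objects whose neighborhood is dense seed and grow a cluster; objects reachable from no dense neighborhood are labeled noise.

```python
import math

NOISE = -1  # label for objects that lie in low-density regions


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i] (i included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Assign each point a cluster id (0, 1, ...) or NOISE. Brute-force O(n^2) sketch."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        # points[i] has a dense neighborhood: start a new cluster and grow it
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # dense neighborhood: keep expanding
    return labels


# Illustrative data (made up): two dense groups and one isolated object.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
    (5.0, 5.0),
]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two groups are separated by a region that contains only the isolated object at (5, 5), so they receive different cluster ids and the isolated object is reported as noise. The result depends on the chosen `eps` and `min_pts`; a production implementation would typically use a spatial index instead of the brute-force neighborhood search shown here.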