Open issues for partitioning clustering methods: an overview
Author(s) - Barioni Maria Camila N., Razente Humberto, Marcelino Alessandra M. R., Traina Agma J. M., Traina Caetano
Publication year - 2014
Publication title - Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.506
H-Index - 47
eISSN - 1942-4795
pISSN - 1942-4787
DOI - 10.1002/widm.1127
Subject(s) - cluster analysis, computer science, variety (cybernetics), data science, scalability, data mining, knowledge extraction, machine learning, artificial intelligence, database
Over the last decades, a great variety of data mining techniques have been developed to support Knowledge Discovery in Databases. Among them, cluster detection techniques are of major importance. Although these techniques have been extensively explored in the scientific literature, at least two important issues remain open: existing algorithms do not scale to large high‐dimensional datasets, and the unsupervised nature of traditional data clustering makes it very difficult to generate meaningful clusters. This article presents an overview of the strategies being explored to address these issues. It also describes a new semi‐supervised clustering strategy that exemplifies the integration of several approaches and that can be employed with partitioning algorithms such as PAM and CLARANS. The technique improves these algorithms by using must‐link feedback provided by users in an interactive and visual environment.
WIREs Data Mining Knowl Discov 2014, 4:161–177. doi: 10.1002/widm.1127
This article is categorized under: Technologies > Structure Discovery and Clustering
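To make the idea of must-link feedback in a partitioning algorithm concrete, here is a minimal Python sketch. It is not the strategy described in the article, whose details are not given in this abstract. It assumes that must-link pairs are merged into groups and that each group is assigned to a medoid as a unit, with a simplified PAM-style greedy swap search. The names `must_link_groups`, `assign`, and `constrained_pam`, the naive initialisation, and the toy data are all illustrative assumptions.

```python
def must_link_groups(n, must_link):
    """Merge must-linked indices into groups with a small union-find."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in must_link:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def assign(dist, medoids, groups):
    """Assign each must-link group, as a unit, to its cheapest medoid."""
    labels = [None] * len(dist)
    cost = 0.0
    for group in groups:
        totals = [sum(dist[i][m] for i in group) for m in medoids]
        best = min(range(len(medoids)), key=totals.__getitem__)
        cost += totals[best]
        for i in group:
            labels[i] = best  # position of the chosen medoid in `medoids`
    return labels, cost


def constrained_pam(dist, k, must_link=()):
    """Greedy medoid swaps (a simplified PAM-style search, an assumption)."""
    n = len(dist)
    groups = must_link_groups(n, must_link)
    medoids = list(range(k))  # naive initialisation: first k objects
    labels, cost = assign(dist, medoids, groups)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                t_labels, t_cost = assign(dist, trial, groups)
                if t_cost < cost:  # keep only strictly improving swaps
                    medoids, labels, cost = trial, t_labels, t_cost
                    improved = True
    return medoids, labels, cost


# Toy 1-D data: two obvious clusters, plus user feedback linking objects 2 and 3.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
medoids, labels, cost = constrained_pam(dist, k=2, must_link=[(2, 3)])
```

Because groups are assigned as a unit, objects 2 and 3 always share a label here, at the price of a total cost no lower than the unconstrained run. Every swap evaluation rescans all objects, so the sketch only suits small inputs and does nothing for the scalability issue the abstract raises.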