Open Access
Evaluation of Subspace Clustering Using Internal Validity Measures
Author(s) - Mariusz Oszust, M. Kostka
Publication year - 2015
Publication title - Advances in Electrical and Computer Engineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.254
H-Index - 23
eISSN - 1844-7600
pISSN - 1582-7445
DOI - 10.4316/aece.2015.03020
Subject(s) - cluster analysis, internal validity, computer science, data mining, subspace topology, statistics, artificial intelligence, mathematics
Different clustering algorithms, or even the same algorithm with different input parameters, can produce different partitions of the same data. Clustering validity measures are then applied to determine which results are of better quality. External measures can be used to evaluate clustering algorithms on datasets with a known division of the data. In a real scenario, however, such information is not available, and internal measures are often applied instead. Subspace clustering techniques can create clusters that use different subsets of the full feature space. For this reason, calculating internal measures with full-feature-space distance metrics (e.g., Euclidean distance) is not justified. In this paper, we propose a novel approach to evaluating subspace clustering with internal quality measures: we apply distance metrics that are able to handle missing attribute values or that are used in dimensionality reduction techniques. Our approach is verified on eight publicly available, widely used datasets. The obtained results are promising and support recommending the proposed distance metrics for calculating the examined internal validation measures.
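The idea of scoring a subspace clustering with a missing-value-aware distance can be illustrated with a minimal sketch. The abstract does not name the specific metrics or internal measures used, so everything below is an assumption made for illustration: the partial (rescaled) Euclidean distance stands in for "a metric that handles missing attribute values", the silhouette width stands in for an internal measure, and the helper names (`partial_euclidean`, `mask_to_subspace`, `silhouette`) and the toy data are hypothetical. Features outside a cluster's subspace are treated as missing (`None`).

```python
import math

def partial_euclidean(a, b):
    """Euclidean distance over features present in both points,
    rescaled to the full dimensionality (assumed metric, not the paper's)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.nan  # no common features: distance is undefined in this sketch
    sq = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(len(a) / len(shared) * sq)

def mask_to_subspace(point, dims):
    """Keep only the features listed in `dims`; mark the rest as missing."""
    return [v if i in dims else None for i, v in enumerate(point)]

def silhouette(points, labels, dist):
    """Mean silhouette width for an arbitrary pairwise distance function."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: cluster 0 lives in features {0, 1}, cluster 1 in {1, 2}.
raw = [[1.0, 1.0, 9.0], [1.2, 0.9, -4.0], [5.0, 5.0, 5.0], [-3.0, 5.1, 5.2]]
labels = [0, 0, 1, 1]
subspaces = {0: {0, 1}, 1: {1, 2}}

masked = [mask_to_subspace(p, subspaces[l]) for p, l in zip(raw, labels)]

# With no missing values, partial_euclidean reduces to plain Euclidean distance.
full_score = silhouette(raw, labels, partial_euclidean)
masked_score = silhouette(masked, labels, partial_euclidean)
print("full-space silhouette:", full_score)
print("subspace-aware silhouette:", masked_score)
```

In this toy setting the irrelevant features act as noise, so the full-space score penalises a partition that is compact within each cluster's own subspace, while the masked score does not. The sketch does not handle pairs of clusters with no shared features; a real evaluation would need an explicit rule for that case.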