Open Access
Associations among similarity and distance measures for binary data in cluster analysis
Author(s) - Jana Cibulková, Zdeněk Šulc, Hana Řezanková, С. М. Сирота
Publication year - 2020
Publication title - Metodološki zvezki
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.127
H-Index - 7
eISSN - 1854-0031
pISSN - 1854-0023
DOI - 10.51936/yelx5179
Subject(s) - similarity (geometry) , cluster analysis , hierarchical clustering , computer science , binary number , data mining , binary data , cluster (spacecraft) , complete linkage clustering , object (grammar) , consensus clustering , process (computing) , fuzzy clustering , artificial intelligence , mathematics , cure data clustering algorithm , image (mathematics) , arithmetic , programming language , operating system
The paper focuses on similarity and distance measures for binary data and their application in cluster analysis. It analyzes 66 measures for binary data in order to provide comprehensive insight into the topic and a well-arranged overview of the measures. For this purpose, the formulas that define them are studied. In the next part of the research, the results of object clustering on generated datasets are compared, and the ability of the measures to produce similar or identical clustering solutions is evaluated. This is done using selected internal and external evaluation criteria and by comparing the assignments of objects to clusters in the process of hierarchical clustering. The paper shows which similarity and distance measures for binary data lead to similar or even identical results in hierarchical cluster analysis.
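The abstract does not list the 66 measures, so the following is only an illustrative sketch of why two differently defined measures can lead to the same clustering. It assumes three well-known binary similarity coefficients (Jaccard, Dice, and simple matching), chosen here as examples rather than taken from the paper. The function names and the toy objects are hypothetical. The coefficients are computed from the usual 2x2 counts a, b, c, d of matching and mismatching positions.

```python
from itertools import combinations


def counts(x, y):
    """Return (a, b, c, d): numbers of 1/1, 1/0, 0/1 and 0/0 positions."""
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    return a, b, c, d


def jaccard(x, y):
    a, b, c, _ = counts(x, y)
    # Convention assumed here: two all-zero vectors are fully similar.
    return a / (a + b + c) if (a + b + c) else 1.0


def dice(x, y):
    a, b, c, _ = counts(x, y)
    return 2 * a / (2 * a + b + c) if (a + b + c) else 1.0


def simple_matching(x, y):
    a, b, c, d = counts(x, y)
    return (a + d) / (a + b + c + d)


def pair_ranking(objects, measure):
    """Index pairs ordered from most to least similar (ties broken by index)."""
    pairs = list(combinations(range(len(objects)), 2))
    return sorted(
        pairs,
        key=lambda ij: (-measure(objects[ij[0]], objects[ij[1]]), ij),
    )


# Hypothetical toy data: four objects described by six binary variables.
objects = [
    (1, 1, 0, 0, 1, 0),
    (1, 0, 0, 0, 1, 0),
    (0, 1, 1, 0, 0, 0),
    (0, 0, 1, 1, 0, 1),
]

same_order = pair_ranking(objects, jaccard) == pair_ranking(objects, dice)
print("Jaccard and Dice rank pairs identically:", same_order)
print("Simple matching ranking:", pair_ranking(objects, simple_matching))
```

Dice is a strictly increasing function of Jaccard (Dice = 2J / (1 + J)), so both order the object pairs identically. Linkage methods that depend only on that ordering, such as single and complete linkage, would therefore merge objects in the same sequence under either measure. Simple matching also counts joint absences (d) and can order pairs differently.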
