Distance Metrics and Clustering Methods for Mixed‐type Data
Author(s) - Foss Alexander H., Markatou Marianthi, Ray Bonnie
Publication year - 2019
Publication title - International Statistical Review
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.051
H-Index - 54
eISSN - 1751-5823
pISSN - 0306-7734
DOI - 10.1111/insr.12274
Subject(s) - cluster analysis, categorical variable, data mining, ordinal data, computer science, strengths and weaknesses, correlation clustering, consensus clustering, data type, cure data clustering algorithm, statistics, mathematics, artificial intelligence, machine learning, psychology, social psychology, programming language
Summary - In spite of the abundance of clustering techniques and algorithms, clustering mixed interval (continuous) and categorical (nominal and/or ordinal) scale data remains a challenging problem. In order to identify the most effective approaches for clustering mixed‐type data, we use both theoretical and empirical analyses to present a critical review of the strengths and weaknesses of the methods identified in the literature. Guidelines on approaches to use under different scenarios are provided, along with potential directions for future research.