CBDL: Context‐based distance learning for categorical attributes
Author(s) - Khorshidpour Zeinab, Hashemi Sattar, Hamzeh Ali
Publication year - 2011
Publication title - International Journal of Intelligent Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.291
H-Index - 87
eISSN - 1098-111X
pISSN - 0884-8173
DOI - 10.1002/int.20499
Subject(s) - categorical variable , context (archaeology) , computer science , artificial intelligence , machine learning , metric (unit) , similarity (geometry) , component (thermodynamics) , feature (linguistics) , distance measures , unsupervised learning , data mining , pattern recognition (psychology) , paleontology , linguistics , operations management , physics , philosophy , economics , image (mathematics) , biology , thermodynamics
Distance learning is an important notion that has played a critical role in the success of various machine learning algorithms. Any learning algorithm that requires dissimilarity or similarity measures has to assume some form of distance function, either explicitly or implicitly. Hence, in recent years a considerable amount of research has been devoted to distance learning. Despite great achievements in this field, a number of important issues need further exploration for real-world datasets that mainly contain categorical attributes. Based on these considerations, the current research presents a Context‐Based Distance Learning approach (CBDL) to advance the state of the art in distance metric learning for categorical datasets. CBDL is based on the idea that the distance between two values of a given categorical attribute can be estimated using information that inherently exists within a subset of attributes called the context. CBDL is composed of two main components: a context extraction component and a distance learning component. The context extraction component is responsible for extracting the relevant subset of the feature set for a given attribute, while the distance learning component learns the distance between each pair of values based on the extracted context. For a comprehensive analysis, we conduct a wide range of experiments in both supervised and unsupervised settings in the presence of noise. Our experimental results reveal that CBDL is the method of choice among distance learning approaches, offering performance comparable to or better than state-of-the-art distance learning schemes according to the studied evaluation measures. © 2011 Wiley Periodicals, Inc.
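
The two components described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state which relevance measure or which distance between distributions CBDL uses, so the sketch assumes symmetric uncertainty for context extraction and the mean total variation distance between conditional distributions for the value distance. The function names, the `threshold` parameter, and the toy `rows` data are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def extract_context(rows, target, threshold=0.1):
    """Return indices of attributes relevant to the `target` attribute.

    Assumption: relevance is measured by symmetric uncertainty,
    2 * I(X; Y) / (H(X) + H(Y)), and kept when it reaches `threshold`.
    """
    target_col = [r[target] for r in rows]
    h_target = entropy(target_col)
    context = []
    for j in range(len(rows[0])):
        if j == target:
            continue
        col = [r[j] for r in rows]
        h_col = entropy(col)
        # Mutual information via H(X) + H(Y) - H(X, Y).
        mi = h_target + h_col - entropy(list(zip(target_col, col)))
        denom = h_target + h_col
        su = 2 * mi / denom if denom > 0 else 0.0
        if su >= threshold:
            context.append(j)
    return context


def value_distance(rows, target, a, b, context):
    """Distance between values `a` and `b` of the `target` attribute.

    Assumption: the distance is the mean, over context attributes, of the
    total variation distance between P(attr | target=a) and P(attr | target=b).
    """
    if not context:
        return 0.0 if a == b else 1.0
    total = 0.0
    for j in context:
        pa = Counter(r[j] for r in rows if r[target] == a)
        pb = Counter(r[j] for r in rows if r[target] == b)
        na, nb = sum(pa.values()), sum(pb.values())
        if na == 0 or nb == 0:
            raise ValueError("both values must occur in the data")
        support = set(pa) | set(pb)
        total += 0.5 * sum(abs(pa[v] / na - pb[v] / nb) for v in support)
    return total / len(context)


# Hypothetical toy data: (colour, fruit, taste, unrelated flag).
rows = [
    ("red", "apple", "sweet", "x"),
    ("red", "apple", "sweet", "y"),
    ("crimson", "apple", "sweet", "x"),
    ("crimson", "apple", "sweet", "y"),
    ("green", "lime", "sour", "x"),
    ("green", "lime", "sour", "y"),
]

context = extract_context(rows, target=0)
d_close = value_distance(rows, 0, "red", "crimson", context)
d_far = value_distance(rows, 0, "red", "green", context)
```

In this toy data the unrelated fourth attribute carries no information about colour and is left out of the context, while "red" and "crimson" co-occur with the same fruit and taste values and therefore end up closer to each other than either is to "green".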
