BRIDGING THE GAP BETWEEN DISTANCE AND GENERALIZATION
Author(s) - Estruch V., Ferri C., Hernández-Orallo J., Ramírez-Quintana M. J.
Publication year - 2014
Publication title - Computational Intelligence
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.353
H-Index - 52
eISSN - 1467-8640
pISSN - 0824-7935
DOI - 10.1111/coin.12004
Subject(s) - generalization, generality, metric space, computer science, metric (unit), space (punctuation), mathematics, algorithm, minimum description length, representation (politics), theoretical computer science, artificial intelligence, discrete mathematics, psychology, mathematical analysis, operations management, politics, law, economics, psychotherapist, operating system, political science
Distance‐based and generalization‐based methods are two families of artificial intelligence techniques that have been successfully applied to a wide range of real‐world problems. In the first family, general algorithms can be applied to any data representation simply by changing the distance function. The metric space defines the search and learning space, which is generally instance‐oriented. In the second family, models are obtained for a given pattern language and can therefore be comprehensible. The generality‐ordered space defines the search and learning space, which is generally model‐oriented. However, the concepts of distance and generalization clash in many different ways, especially when knowledge representation is complex (e.g., structured data). This work establishes a framework where these two fields can be integrated in a consistent way. We introduce the concept of distance‐based generalization, which connects all the generalized examples in such a way that all of them are reachable inside the generalization by following straight paths in the metric space. This makes the metric space and the generality‐ordered space coherent (or even dual). We also introduce a definition of minimal distance‐based generalization that can be seen as the first formulation of the Minimum Description Length (MDL)/Minimum Message Length (MML) principle in terms of a distance function. We instantiate and develop the framework for the most common data representations and distances, and show that consistent instances can be found for numerical data, nominal data, sets, lists, tuples, graphs, first‐order atoms, and clauses. As a result, general learning methods that integrate the best of distance‐based and generalization‐based methods can be defined and adapted to any specific problem by appropriately choosing the distance, the pattern language, and the generalization operator.
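To make the idea of reachability along straight paths concrete, the following is a minimal sketch for numerical data only. It assumes the absolute difference as the distance, closed intervals as the pattern language, and one simplified reading of "straight path": an element z lies on the path between x and y when d(x, z) + d(z, y) = d(x, y). All function names are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from itertools import combinations


def distance(x: float, y: float) -> float:
    """Assumed distance for numerical data: absolute difference."""
    return abs(x - y)


def is_between(x: float, z: float, y: float, tol: float = 1e-9) -> bool:
    """z lies on a straight path from x to y if the triangle
    inequality holds with equality (simplified reading)."""
    return abs(distance(x, z) + distance(z, y) - distance(x, y)) <= tol


def interval_generalization(examples):
    """Assumed pattern language: closed intervals (lo, hi).
    Returns the smallest interval covering all examples."""
    return (min(examples), max(examples))


def covers(pattern, value: float) -> bool:
    """True if the interval pattern covers the value."""
    lo, hi = pattern
    return lo <= value <= hi


def is_distance_based(pattern, examples, candidates) -> bool:
    """Check, over a finite candidate grid, that every element lying
    between any two examples is also covered by the pattern."""
    for x, y in combinations(examples, 2):
        for z in candidates:
            if is_between(x, z, y) and not covers(pattern, z):
                return False
    return True


examples = [2.0, 5.0, 3.5]
candidates = [i * 0.5 for i in range(21)]  # grid 0.0, 0.5, ..., 10.0
pattern = interval_generalization(examples)

print(pattern)
print(is_distance_based(pattern, examples, candidates))
print(is_distance_based((2.0, 3.0), examples, candidates))
```

Under these assumptions the smallest covering interval passes the check, while a narrower interval such as (2.0, 3.0) fails because it leaves out points on the path between 2.0 and 5.0. Other data types named in the abstract (sets, lists, graphs, and so on) would need their own distance and pattern language.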
