Establishing value mappings using statistical models and user feedback
Author(s) - Jaewoo Kang, Tae Sik Han, Dongwon Lee, Prasenjit Mitra
Publication year - 2005
Publication title - CiteSeerX (The Pennsylvania State University)
Language(s) - English
Resource type - Conference proceedings
ISBN - 1-59593-140-6
DOI - 10.1145/1099554.1099569
Subject(s) - computer science , entropy (arrow of time) , matching (statistics) , data mining , value (mathematics) , similarity (geometry) , data modeling , algorithm , statistical model , theoretical computer science , artificial intelligence , machine learning , mathematics , statistics , image (mathematics) , physics , quantum mechanics , database
In this paper, we present a "value mapping" algorithm that does not rely on syntactic similarity or semantic interpretation of the values. The algorithm first constructs a statistical model (e.g., a co-occurrence frequency or entropy vector) that captures the unique characteristics of values and their co-occurrence. It then finds matching values by computing the distances between the models, iteratively refining the models using user feedback. Our experimental results suggest that our approach successfully establishes value mappings even in the presence of opaque data values and thus can be a useful addition to existing data integration techniques.
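
The general idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the exact models, distance measure, or feedback procedure, so everything below is an assumption made for illustration. Here each value is summarized by a two-number signature (its relative frequency and the entropy of the values it co-occurs with), values are matched by minimizing the total Euclidean distance between signatures, and user feedback is reduced to a dictionary of already-confirmed pairs. The names `value_signatures`, `match_values`, and `confirmed` are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def value_signatures(rows):
    """Summarize each column-0 value by (relative frequency, co-occurrence entropy).

    Both numbers are unchanged if the values are renamed, so they can be
    compared across tables that encode the same data with different labels.
    """
    total = len(rows)
    cooc = defaultdict(Counter)
    for a, b in rows:
        cooc[a][b] += 1
    sigs = {}
    for a, counts in cooc.items():
        n = sum(counts.values())
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        sigs[a] = (n / total, entropy)
    return sigs


def match_values(rows1, rows2, confirmed=None):
    """Map column-0 values of rows1 to those of rows2 by signature distance.

    `confirmed` holds pairs a user has already approved (an assumed,
    simplified stand-in for the feedback loop); they are kept fixed.
    Exhaustive search is used, so this only suits small value domains.
    """
    confirmed = dict(confirmed or {})
    s1, s2 = value_signatures(rows1), value_signatures(rows2)
    left = [v for v in s1 if v not in confirmed]
    right = [v for v in s2 if v not in confirmed.values()]
    best, best_cost = None, math.inf
    for perm in permutations(right, len(left)):
        cost = sum(math.dist(s1[a], s2[b]) for a, b in zip(left, perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    if best is None:
        raise ValueError("second table has too few candidate values")
    mapping = dict(confirmed)
    mapping.update(zip(left, best))
    return mapping


# Two toy tables holding the same data under different opaque encodings.
table1 = [("a", "p"), ("a", "p"), ("a", "q"), ("a", "q"),
          ("b", "p"), ("b", "p"), ("b", "p"), ("c", "q")]
table2 = [("1", "X"), ("1", "X"), ("1", "X"), ("2", "Y"),
          ("3", "X"), ("3", "X"), ("3", "Y"), ("3", "Y")]
print(match_values(table1, table2))
```

In this toy data the signatures of corresponding values coincide exactly, so the search recovers the intended mapping without any string comparison. With real data the signatures would be noisy, which is where confirmed pairs from a user could help narrow the remaining candidates.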