Open Access
Learning Distance Metrics for Entity Resolution
Author(s) - Lingli Li, Xiaodan Shang, Jinbao Li, Jin Hu
Publication year - 2018
Publication title - IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2018.2871168
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Entity resolution (ER) is the task of finding database records that refer to the same real-world entity. A key step in ER is choosing a suitable distance (similarity) function for each database field, so that the similarity of two records can be quantified. Most existing ER approaches focus on defining a good matching rule on top of generic or hand-crafted distance metrics. In this paper, we explore two learnable string distance metrics for two kinds of ER problems, training them with principal component analysis and the large margin nearest neighbor algorithm. Experimental results on real data sets show that our approaches can improve entity resolution accuracy over traditional techniques.
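To make the baseline setting concrete, the following is a minimal sketch of the traditional approach the abstract contrasts against: a generic string distance per field, combined by a hand-written matching rule. It is not the authors' learned metric. The function names, the sample records, the equal field weights and the 0.2 threshold are all assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Scale edit distance into [0, 1] by the longer string's length."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def field_distances(r1: Dict[str, str], r2: Dict[str, str],
                    fields: Sequence[str]) -> List[float]:
    """One generic distance per database field (case-insensitive)."""
    return [normalized_distance(r1[f].lower(), r2[f].lower()) for f in fields]


def is_match(r1: Dict[str, str], r2: Dict[str, str], fields: Sequence[str],
             weights: Sequence[float], threshold: float) -> bool:
    """Hand-crafted rule: weighted sum of field distances under a threshold."""
    dists = field_distances(r1, r2, fields)
    return sum(w * d for w, d in zip(weights, dists)) <= threshold


# Hypothetical records; weights and threshold are hand-picked, not learned.
FIELDS = ["name", "city"]
WEIGHTS = [0.5, 0.5]
THRESHOLD = 0.2
rec_a = {"name": "Jon Smith", "city": "Oxford"}
rec_b = {"name": "John Smith", "city": "Oxford"}
rec_c = {"name": "Mary Jones", "city": "Leeds"}

same_ab = is_match(rec_a, rec_b, FIELDS, WEIGHTS, THRESHOLD)
same_ac = is_match(rec_a, rec_c, FIELDS, WEIGHTS, THRESHOLD)
```

In this sketch the weights and threshold are fixed by hand. A learned metric, as the abstract describes, would fit the distance to labeled data rather than rely on such fixed choices.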
