Open Access
A distributed near-optimal LSH-based framework for privacy-preserving record linkage
Author(s) - Dimitrios Karapiperis, Vassilios S. Verykios
Publication year - 2014
Publication title - Computer Science and Information Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.244
H-Index - 24
eISSN - 2406-1018
pISSN - 1820-0214
DOI - 10.2298/csis140215040k
Subject(s) - computer science, computation, overhead (engineering), locality, hash function, locality sensitive hashing, volume (thermodynamics), distributed computing, order (exchange), hash table, theoretical computer science, computer security, algorithm, operating system, linguistics, philosophy, physics, finance, quantum mechanics, economics
In this paper, we present a framework that relies on the Map/Reduce paradigm to distribute computations uniformly among underutilized commodity hardware resources, without imposing extra overhead on the existing infrastructure. The volume of distance computations required for record comparison is greatly reduced by using the Locality-Sensitive Hashing technique, which is optimally tuned to avoid highly redundant computations. Experimental results illustrate the effectiveness of our distributed framework in finding the matched record pairs in voluminous data sets.
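
The general idea of combining Locality-Sensitive Hashing with a map/reduce style of grouping can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: the abstract does not specify the record encoding, the hash family, or the tuning procedure. The sketch assumes records are already encoded as equal-length binary vectors, uses bit-sampling LSH for Hamming distance, and simulates the map and reduce phases with plain Python functions. All names (`make_hash_functions`, `map_phase`, `reduce_phase`) and all parameter values are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def make_hash_functions(num_tables, bits_per_key, vector_len, seed=0):
    """One bit-sampling hash function per table: a random set of positions."""
    rng = random.Random(seed)
    return [rng.sample(range(vector_len), bits_per_key) for _ in range(num_tables)]


def map_phase(records, hash_functions):
    """Emit ((table_id, bucket_key), record_id) for every record and table."""
    for rec_id, vec in records.items():
        for table_id, positions in enumerate(hash_functions):
            key = tuple(vec[p] for p in positions)
            yield (table_id, key), rec_id


def reduce_phase(emitted, records, threshold):
    """Group ids by bucket and compare only pairs that share a bucket."""
    buckets = defaultdict(list)
    for key, rec_id in emitted:
        buckets[key].append(rec_id)

    seen = set()      # pairs already compared, so each pair is compared once
    matches = set()
    for ids in buckets.values():
        for pair in combinations(sorted(ids), 2):
            if pair in seen:
                continue
            seen.add(pair)
            if hamming(records[pair[0]], records[pair[1]]) <= threshold:
                matches.add(pair)
    return matches


# Toy data (assumed): "a" and "b" are identical, "c" differs in every bit.
base = [1, 0, 1, 1, 0, 0, 1, 0] * 2
records = {"a": base, "b": list(base), "c": [1 - bit for bit in base]}

hash_functions = make_hash_functions(num_tables=4, bits_per_key=5, vector_len=16)
matches = reduce_phase(map_phase(records, hash_functions), records, threshold=2)
print(matches)
```

In this toy run, identical records always land in the same bucket, while a record that differs in every bit can never share a bucket key with them, so it is never compared at all. The number of tables and the number of sampled bits per key control the trade-off between missed matches and redundant comparisons; the values used here are arbitrary and are not tuned in the sense the abstract describes.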
