Open Access
Redblock: a tool for online deduplication on large datasets
Author(s) - Luan Félix Pimentel, Igor Lemos Vicente, Guilherme Dal Bianco
Publication year - 2017
Publication title - Revista Brasileira de Computação Aplicada
Language(s) - English
Resource type - Journals
ISSN - 2176-6649
DOI - 10.5335/rbca.v9i2.7143
Subject(s) - data deduplication, computer science, process (computing), data mining, database, operating system
Online data deduplication aims to identify records that represent the same real-world entity in a continuous data stream. It must process a wide range of information with high effectiveness and without delays. This paper introduces Redblock, a tool for real-time data deduplication that combines a distributed platform for online processing with an inverted index. In the experimental evaluation, Redblock delivered good preliminary results in terms of efficiency and effectiveness on a database.
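
To illustrate the general idea of pairing an inverted index with online deduplication, here is a minimal single-process sketch in Python. It is not Redblock's implementation: the abstract gives no details of the tool's internals, so the `OnlineDeduplicator` class, the whitespace tokenizer, the Jaccard similarity measure, and the 0.6 threshold are all assumptions chosen for illustration, and the distributed platform is left out entirely.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class OnlineDeduplicator:
    """Hypothetical streaming deduplicator backed by an inverted index."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold  # assumed similarity cut-off
        self.index = {}             # token -> set of record ids containing it
        self.records = {}           # record id -> token set

    def process(self, record_id, text):
        """Return ids of earlier records that look like duplicates, then index the new one."""
        tokens = tokenize(text)

        # The inverted index limits comparisons to records sharing at least one token.
        candidates = set()
        for token in tokens:
            candidates |= self.index.get(token, set())

        matches = sorted(
            rid for rid in candidates
            if jaccard(tokens, self.records[rid]) >= self.threshold
        )

        # Index the incoming record so later arrivals can be matched against it.
        self.records[record_id] = tokens
        for token in tokens:
            self.index.setdefault(token, set()).add(record_id)
        return matches


dedup = OnlineDeduplicator(threshold=0.6)
first = dedup.process(1, "John Smith Oxford")
second = dedup.process(2, "john smith oxford uk")
third = dedup.process(3, "Maria Silva Porto Alegre")
print(first, second, third)
```

Each incoming record is compared only against records that share a token with it, which is what keeps per-record latency low as the stream grows. A real deployment would also need to bound the size of the index and partition it across workers.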
