Open Access
Errors Detection and Correction in Large Scale Data Collecting
Author(s) - Renato Bruni, Antonio Sassano
Publication year - 2001
Publication title - Lecture Notes in Computer Science
Language(s) - English
Resource type - Book series
SCImago Journal Rank - 0.249
H-Index - 400
eISSN - 1611-3349
pISSN - 0302-9743
DOI - 10.1007/3-540-44816-0_9
Subject(s) - computer science , set (abstract data type) , sequence (biology) , algorithm , redundancy (engineering) , satisfiability , data mining , data set , range (aeronautics) , encoding (memory) , artificial intelligence , genetics , materials science , composite material , biology , programming language , operating system
This paper addresses the automatic detection and correction of inconsistent or out-of-range data in a general process of statistical data collection. In this setting, errors are usually detected by formulating a set of rules that data records must satisfy in order to be declared correct. As a first relevant point, the set of rules itself is checked for inconsistency or redundancy by encoding it into a propositional logic formula and solving a sequence of satisfiability problems. This set of rules is then used to detect erroneous data. In the subsequent error-correction phase, the rules must still be satisfied, but the erroneous records should be altered as little as possible and the frequency distributions of correct data should be preserved. As a second relevant point, error correction is modeled by encoding the rules as linear inequalities and solving a sequence of set covering problems. The proposed procedure is tested on a real-world census case.
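The pipeline described in the abstract (validate the rule set, detect erroneous records, then correct them with as few changes as possible) can be illustrated with a small Python sketch. This is not the authors' implementation: it replaces the satisfiability and set covering formulations with brute-force enumeration over a tiny domain, and it ignores the preservation of frequency distributions. The field names, domains, rules, and function names are all hypothetical and chosen only for illustration.

```python
from itertools import combinations, product

# Hypothetical toy domain: every field takes values from a small finite set.
DOMAINS = {
    "age_group": ["child", "adult"],
    "marital": ["single", "married"],
    "employed": [False, True],
}

# Hypothetical rules: each predicate returns True when the record respects it.
RULES = [
    lambda r: not (r["age_group"] == "child" and r["marital"] == "married"),
    lambda r: not (r["age_group"] == "child" and r["employed"]),
]


def all_records():
    """Enumerate every possible record over DOMAINS (feasible only for toy sizes)."""
    fields = list(DOMAINS)
    for values in product(*(DOMAINS[f] for f in fields)):
        yield dict(zip(fields, values))


def is_consistent(rules):
    """A rule set is consistent if at least one record satisfies all rules."""
    return any(all(rule(r) for rule in rules) for r in all_records())


def redundant_rules(rules):
    """Indices of rules implied by the others (dropping them admits no new record)."""
    redundant = []
    for i, rule in enumerate(rules):
        others = rules[:i] + rules[i + 1:]
        if all(rule(r) for r in all_records() if all(o(r) for o in others)):
            redundant.append(i)
    return redundant


def violations(record, rules):
    """Indices of the rules that the record violates (empty list means correct)."""
    return [i for i, rule in enumerate(rules) if not rule(record)]


def correct(record, rules):
    """Return a rule-satisfying record that changes as few fields as possible.

    Brute force stands in for the set covering model: try changing
    0 fields, then 1, then 2, and so on, and return the first feasible record.
    """
    fields = list(DOMAINS)
    for k in range(len(fields) + 1):
        for subset in combinations(fields, k):
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if not violations(candidate, rules):
                    return candidate
    return None  # Only reached when the rule set is inconsistent.


if __name__ == "__main__":
    bad = {"age_group": "child", "marital": "married", "employed": False}
    print(is_consistent(RULES))
    print(violations(bad, RULES))
    print(correct(bad, RULES))
```

Enumeration grows exponentially with the number of fields, which is why a real large-scale application would need dedicated solvers of the kind the abstract mentions.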
