An Algebraic Approach Towards Data Cleaning
Author(s) - Ridha Khédri, Fei Chiang, Khair Eddin Sabri
Publication year - 2013
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2013.09.009
Subject(s) - computer science, association rule learning, set (abstract data type), preprocessor, function (biology), data pre processing, data set, data mining, association (psychology), commodity, algebraic number, theoretical computer science, programming language, artificial intelligence, mathematical analysis, philosophy, mathematics, epistemology, evolutionary biology, biology, economics, market economy
The amount of data generated and collected has proliferated over the past several years. One leading factor behind this growth is cheaper commodity storage, which makes it easier for organisations to house large data stores containing massive amounts of historical data. To analyse these data sets effectively, a preprocessing step is often required, as most real data sets are inherently dirty and inconsistent. Existing data cleaning tools have focused on cleaning the errors at hand. In this paper, we take a more formal approach and propose the use of information algebra as a general theory to describe structured data sets and data cleaning. We formally define the notions of association rule and association function, and we present results relating these concepts. We also propose an algorithm for generating association rules from a given structured data set.
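
As a rough illustration of what an association rule over a structured data set can look like, the sketch below uses the classical support/confidence formulation from data mining. It is not the information-algebra definition or the algorithm proposed in the paper, neither of which is given in this abstract. The function name `association_rules`, the thresholds, and the sample rows are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def association_rules(rows, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) tuples.

    Classical support/confidence rules between single (attribute, value)
    items. This is an assumed, generic formulation, not the paper's method.
    """
    n = len(rows)
    item_count = {}
    pair_count = {}
    for row in rows:
        # Each row is a dict; its items are (attribute, value) pairs.
        items = sorted(row.items())
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # fraction of rows containing both items
        if support < min_support:
            continue
        # Try the rule in both directions: a => b and b => a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Made-up sample data; the third row plays the role of a "dirty" record.
rows = [
    {"city": "Hamilton", "province": "ON"},
    {"city": "Hamilton", "province": "ON"},
    {"city": "Hamilton", "province": "QC"},
    {"city": "Toronto", "province": "ON"},
]

rules = association_rules(rows)
for lhs, rhs, support, confidence in rules:
    print(lhs, "=>", rhs, f"support={support:.2f}", f"confidence={confidence:.2f}")
```

In a cleaning setting, rows that contradict a high-confidence rule (here, the row pairing Hamilton with QC) are natural candidates for inspection, although how the paper itself uses such rules is not stated in the abstract.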
