Premium
Mining Short‐Rule Covers in Relational Databases
Author(s) -
Carpineto Claudio,
Romano Giovanni
Publication year - 2003
Publication title -
computational intelligence
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.353
H-Index - 52
eISSN - 1467-8640
pISSN - 0824-7935
DOI - 10.1111/1467-8640.00222
Subject(s) - computer science , data mining , hash function , database , set (abstract data type) , attribute domain , relational database , object (grammar) , rough set , function (biology) , mathematics , algorithm , artificial intelligence , computer security , evolutionary biology , biology , programming language
An implication rule Q → R is a statement of the form “for all objects in the database, if an object has the attribute–value pairs Q then it has also the attribute–value pairs R .” This simple type of rule is theoretically interesting, because it supports reasoning, similar to functional dependencies in database theory, and it may be of practical significance because the size of the set of implication rules that hold in a relation can remain substantially high even when mining real data and considering only most general covers; i.e., covers containing rules with unredundant right and left sizes. Motivated by these observations, we focus on the extraction of short‐rule covers, which cannot be efficiently mined by standard rule miners. We present an algorithm driven by “negative examples” (i.e., satisfy Q but not R ) to prune the rule‐candidate lattice associated with each “positive example” (i.e., satisfies both Q and R ). The algorithm scales up quite well with respect to the number of objects and it is particularly suitable for databases with attributes described by large domains. Furthermore, a perfect hash function ensures extraction of short‐rule covers even from databases containing a large number of attributes.