Open Access
Discovering Top-k Probabilistic Frequent Itemsets from Uncertain Databases
Author(s) - Haifeng Li, Yuejin Zhang, Ning Zhang
Publication year - 2017
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2017.11.482
Subject(s) - probabilistic logic, computer science, data mining, probabilistic database, uncertain data, focus (optics), association rule learning, function (biology), database, artificial intelligence, relational database, database theory, physics, evolutionary biology, biology, optics
Probabilistic frequent itemset mining finds the itemsets in an uncertain database whose support exceeds a threshold with a given probabilistic confidence. However, when the threshold is small, the mining results become massive and are hard for users to understand. In this paper, we focus on this problem and propose a method to find the top-k probabilistic frequent itemsets, which, to the best of our knowledge, has not been addressed before. A scoring function is defined to evaluate the level of each itemset. We introduce a compact data structure, named TopKPFITree, to maintain the mining results and some other information. Furthermore, an efficient algorithm, TopKPFIM, is proposed to build the TopKPFITree and obtain the results. Our experimental results over uncertain datasets show that our algorithm significantly outperforms the Naive algorithm.
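
To make the problem statement concrete, here is a minimal Python sketch of a brute-force top-k baseline. It is not the paper's TopKPFIM algorithm or its TopKPFITree, and the abstract does not give the paper's scoring function, so the sketch assumes the score of an itemset is its frequentness probability, that is, the probability that its support reaches the threshold. It also assumes that items and transactions are independent and that each transaction maps an item to its existence probability. The names frequentness_probability, itemset_probs and naive_top_k are hypothetical.

```python
from itertools import combinations


def frequentness_probability(probs, minsup):
    """P(support >= minsup) for independent per-transaction probabilities."""
    # dist[j] = probability that exactly j of the transactions processed
    # so far contain the itemset (dynamic programming over transactions).
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # itemset absent from this transaction
            nxt[j + 1] += q * p       # itemset present in this transaction
        dist = nxt
    return sum(dist[minsup:])


def itemset_probs(db, itemset):
    """Probability that each transaction contains every item of `itemset`.

    Assumption: items within a transaction are independent.
    """
    result = []
    for transaction in db:
        p = 1.0
        for item in itemset:
            p *= transaction.get(item, 0.0)
        result.append(p)
    return result


def naive_top_k(db, minsup, k, max_size=2):
    """Score every itemset up to `max_size` items and keep the k best.

    Assumption: the score is the frequentness probability; the paper's
    actual scoring function may differ.
    """
    items = sorted({item for transaction in db for item in transaction})
    scored = []
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            probs = itemset_probs(db, itemset)
            scored.append((frequentness_probability(probs, minsup), itemset))
    # Highest score first; ties broken by itemset for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


# Toy uncertain database: item -> existence probability per transaction.
db = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.8, "c": 0.4},
    {"a": 0.5, "b": 0.6},
]
top = naive_top_k(db, minsup=2, k=2)
```

Under these assumptions, an itemset would count as probabilistic frequent when its frequentness probability exceeds the chosen confidence, and the top-k variant returns the k highest-scoring itemsets instead. Enumerating every candidate grows exponentially with the number of items, which is the cost a dedicated structure and algorithm aim to avoid.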