Discovering Top-k Probabilistic Frequent Itemsets from Uncertain Databases

Haifeng Li; Yuejin Zhang; Ning Zhang

Open Access

Author(s) - Haifeng Li, Yuejin Zhang, Ning Zhang

Publication year - 2017

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2017.11.482

Subject(s) - probabilistic logic, computer science, data mining, probabilistic database, uncertain data, focus (optics), association rule learning, function (biology), database, artificial intelligence, relational database, database theory, physics, evolutionary biology, biology, optics

Probabilistic frequent itemset mining finds the itemsets in an uncertain database whose support exceeds a given threshold with a given probabilistic confidence. However, when the threshold is small, the mining results become massive and are hard for users to understand. In this paper, we focus on this problem and propose a method to obtain the top-k probabilistic frequent itemsets, a task which, to the best of our knowledge, has not been addressed before. A scoring function is defined to evaluate and rank the itemsets. We introduce a compact data structure, named TopKPFITree, to maintain the mining results together with some auxiliary information. Furthermore, an efficient algorithm, TopKPFIM, is proposed to build the TopKPFITree and obtain the results. Our experimental results over uncertain datasets show that our algorithm significantly outperforms the Naive algorithm.
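To make the problem statement concrete, the following is a minimal brute-force sketch in Python, not the TopKPFIM algorithm or the TopKPFITree structure, whose details are not given in the abstract. It assumes an attribute-level uncertainty model in which each transaction maps items to independent existence probabilities, and it uses the frequentness probability P(support >= min_sup) as a stand-in score, since the paper's actual scoring function is not stated here. All names (`itemset_probs`, `frequentness_probability`, `naive_top_k`, `max_len`) are hypothetical.

```python
from itertools import combinations


def itemset_probs(db, itemset):
    """Per-transaction probability that every item of `itemset` is present.

    Assumption: items within a transaction exist independently.
    """
    probs = []
    for txn in db:
        p = 1.0
        for item in itemset:
            p *= txn.get(item, 0.0)  # missing item means probability 0
        probs.append(p)
    return probs


def frequentness_probability(probs, min_sup):
    """P(support >= min_sup), support being a sum of independent Bernoulli trials.

    dp[j] holds the probability that exactly j transactions so far contain
    the itemset, except dp[min_sup], which absorbs every count >= min_sup.
    """
    dp = [1.0] + [0.0] * min_sup
    for p in probs:
        # Go downward so dp[j - 1] is still the value from the previous step.
        for j in range(min_sup, -1, -1):
            stay = dp[j] if j == min_sup else dp[j] * (1.0 - p)
            move = dp[j - 1] * p if j > 0 else 0.0
            dp[j] = stay + move
    return dp[min_sup]


def naive_top_k(db, k, min_sup, max_len=3):
    """Enumerate all itemsets up to `max_len` items and keep the k best scores.

    Assumption: the score is the frequentness probability (illustrative only).
    """
    items = sorted({item for txn in db for item in txn})
    scored = []
    for n in range(1, max_len + 1):
        for itemset in combinations(items, n):
            score = frequentness_probability(itemset_probs(db, itemset), min_sup)
            scored.append((score, itemset))
    # Highest score first; ties broken by itemset for a deterministic order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


if __name__ == "__main__":
    # Toy uncertain database: item -> existence probability per transaction.
    example_db = [
        {"a": 0.9, "b": 0.5},
        {"a": 0.8, "b": 0.5},
        {"a": 0.5},
    ]
    for score, itemset in naive_top_k(example_db, k=2, min_sup=2):
        print(itemset, round(score, 4))
```

Exhaustive enumeration grows exponentially with the number of items, which is the kind of cost a dedicated structure and algorithm would aim to avoid; the sketch is only meant to pin down what "top-k probabilistic frequent itemsets" can mean under the stated assumptions.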
