Algorithms Based on Dynamic Minimum Probabilistic Support and Utility Thresholds for Mining Top-K High-Utility Itemsets from Uncertain Databases
Author(s) - Khoi Nguyen, Thien Nguyen
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3612006
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Mining high-utility itemsets (HUIs) from large, uncertain databases is challenging because of the data volume and the size of the search space. Choosing suitable minimum utility and minimum probabilistic support thresholds is also time-consuming: thresholds set too high yield too few itemsets, while thresholds set too low yield too many itemsets at a high computational cost. To address these issues, we formulate the problem of mining Top-K high-utility itemsets from uncertain databases and propose four algorithms that solve it: FTKHUUIM+, ITUHUFP, TUHUFP, and TKUU. These algorithms use automated threshold-raising strategies and specialized storage structures to achieve good performance. All four prune candidate itemsets using dynamic minimum probabilistic support and utility thresholds. Although their pruning strategies differ, each follows a recursive process that selects the Top-K HUIs by pairing items from the database and filtering the resulting patterns against the thresholds. Experimental results on public benchmark datasets show that the algorithms scale well and perform robustly, in terms of runtime and efficiency, on both dense and sparse datasets. Among the proposed algorithms, ITUHUFP demonstrates the best performance and stability across the benchmark datasets.
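To illustrate the general idea of threshold raising in Top-K mining, the following is a minimal Python sketch. It is not an implementation of FTKHUUIM+, ITUHUFP, TUHUFP, or TKUU, whose details the abstract does not give. Everything in it is an assumption made for illustration: the database layout (each transaction is a pair of an existence probability and an item-to-utility map), the function name `mine_top_k`, the use of expected support as the probabilistic support measure, and the use of transaction-weighted utility (TWU) as the pruning bound. For simplicity, only the utility threshold is raised dynamically here; the support threshold is a fixed parameter.

```python
import heapq
from itertools import count


def mine_top_k(db, k, min_exp_sup=0.0):
    """Return the k itemsets with the highest utility as (itemset, utility) pairs.

    db is a list of (probability, {item: utility}) pairs. This layout is a
    hypothetical one chosen for the sketch; utilities must be non-negative.
    """
    heap = []        # min-heap of (utility, tie_breaker, itemset)
    tie = count()    # keeps heap entries comparable when utilities are equal
    min_util = 0     # dynamic threshold: utility of the current k-th best
    items = sorted({item for _, row in db for item in row})

    def extend(prefix, rows, start):
        nonlocal min_util
        for idx in range(start, len(items)):
            item = items[idx]
            sub = [(p, row) for p, row in rows if item in row]
            if not sub:
                continue
            # Expected support is anti-monotone, so supersets cannot recover.
            if sum(p for p, _ in sub) < min_exp_sup:
                continue
            # TWU bounds the utility of this itemset and all its supersets.
            if sum(sum(row.values()) for _, row in sub) < min_util:
                continue
            itemset = prefix + (item,)
            util = sum(row[i] for _, row in sub for i in itemset)
            if len(heap) < k:
                heapq.heappush(heap, (util, next(tie), itemset))
            elif util > heap[0][0]:
                heapq.heapreplace(heap, (util, next(tie), itemset))
            if len(heap) == k:
                min_util = heap[0][0]  # raise the threshold
            extend(itemset, sub, idx + 1)

    extend((), db, 0)
    ranked = sorted(heap, key=lambda entry: entry[0], reverse=True)
    return [(itemset, util) for util, _, itemset in ranked]


# Toy uncertain database (made-up values).
db = [
    (0.9, {"a": 5, "b": 2}),
    (0.6, {"a": 4, "b": 3, "c": 1}),
    (0.8, {"b": 1, "c": 6}),
]
top2 = mine_top_k(db, k=2)
```

On this toy database, `top2` contains `("a", "b")` with utility 14 and `("b", "c")` with utility 11. Once the heap holds k itemsets, the threshold rises to the smallest utility in the heap, so branches whose TWU falls below it are skipped without the user ever supplying a utility threshold.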