
Research on improvement of high utility pattern mining algorithm over data streams
Author(s) - Feng Guo, Yuqiang Li, Lin Li
Publication year - 2020
Publication title - IOP Conference Series: Materials Science and Engineering
Language(s) - English
Resource type - Journals
eISSN - 1757-899X
pISSN - 1757-8981
DOI - 10.1088/1757-899x/715/1/012022
Subject(s) - header , computer science , data mining , table (database) , data stream mining , tree (set theory) , database transaction , data stream , algorithm , database , mathematics , mathematical analysis , computer network , telecommunications
Existing sliding-window algorithms for mining high utility patterns over data streams either scan the dataset multiple times or retain redundant items. To address this, this paper proposes HUIGRT, an efficient algorithm for mining high utility patterns over data streams based on a global revision header table. First, the global revision header table and the utility tree are constructed. The global revision header table stores the items and transaction utilities of the current data domain that need to be processed, and the utility tree stores all of the utility information for the itemsets in the transactions, which avoids multiple dataset scans. Then, the algorithm mines all high utility patterns using the global revision header table and the utility tree. Finally, redundant items are deleted by revising the global revision header table, while the utility tree is updated to take in new data. The paper compares HUIGRT with the existing efficient algorithms HUPMS and HUM-UT on three datasets of differing sparsity: Mushroom, T10.I4.D100K and Retail. The results show that HUIGRT performs better in both time and space than the other two algorithms.
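To make the problem setting concrete, the following is a minimal Python sketch of high utility itemset mining over a sliding window of batches. It is not HUIGRT: it has no global revision header table and no utility tree, and it enumerates itemsets by brute force within each window. It only illustrates what "high utility pattern" and "sliding window" mean here, plus the standard pruning of items whose transaction-weighted utility falls below the threshold. The function names, the `{item: utility}` transaction format, and the batch-based window are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def mine_window(transactions, min_util):
    """Return {itemset: utility} for itemsets with utility >= min_util.

    Each transaction is a dict {item: utility of that item in the transaction}
    (an assumed format, not taken from the paper).
    """
    # Transaction-weighted utility (TWU) of an item: the summed total utility
    # of every transaction that contains it. It is an upper bound on the
    # utility of any itemset containing that item.
    twu = {}
    for tx in transactions:
        total = sum(tx.values())
        for item in tx:
            twu[item] = twu.get(item, 0) + total

    # Items whose TWU is below the threshold cannot occur in any
    # high utility itemset, so they are dropped before enumeration.
    promising = {item for item, value in twu.items() if value >= min_util}

    # Brute-force enumeration of itemsets per transaction (illustration only;
    # this is exponential in transaction length).
    utilities = {}
    for tx in transactions:
        items = sorted(item for item in tx if item in promising)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                gain = sum(tx[item] for item in combo)
                utilities[combo] = utilities.get(combo, 0) + gain

    return {k: v for k, v in utilities.items() if v >= min_util}


def mine_stream(batches, window_size, min_util):
    """Yield the high utility itemsets of each full window of batches."""
    window = deque(maxlen=window_size)  # the oldest batch drops out automatically
    for batch in batches:
        window.append(batch)
        if len(window) == window_size:
            transactions = [tx for b in window for tx in b]
            yield mine_window(transactions, min_util)


# Hypothetical toy stream: three batches of transactions.
batches = [
    [{"a": 5, "b": 2}, {"b": 4, "c": 3}],
    [{"a": 10, "c": 1}],
    [{"b": 6, "c": 9}],
]
for result in mine_stream(batches, window_size=2, min_util=12):
    print(result)
```

This sketch rebuilds everything from scratch each time the window slides. The abstract describes HUIGRT as avoiding that cost by revising the header table and updating the utility tree incrementally.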