Open Access
An Efficient IFP-Tree Based High Utility Pattern Mining of Itemsets with Indexing
Author(s) -
Sravani Pragna. K,
S. Dhivya,
A. Ragini,
P. Preethi,
J. Yogapriya
Publication year - 2019
Publication title -
International Journal of Engineering and Advanced Technology
Language(s) - English
Resource type - Journals
ISSN - 2249-8958
DOI - 10.35940/ijeat.a9414.109119
Subject(s) - computer science , data mining , database transaction , association rule learning , profit (economics) , reputation , tree (set theory) , tree structure , set (abstract data type) , snapshot (computer storage) , information retrieval , database , binary tree , mathematics , algorithm , mathematical analysis , social science , sociology , programming language , economics , microeconomics
Conventional association rule mining and Frequent Itemset Mining (FIM) methods cannot meet the needs that arise in certain real applications. In practice, a user making a decision often wants to know the total profit earned by an item or itemset. Evaluating this requires taking the purchased quantity into account: the profit of an item depends on both the gain from a single unit and the number of units purchased. Utility mining was introduced to address this. Here, the utility of an itemset is calculated from the product of the quantity purchased and the per-unit gain of each of its items. Utility mining thus considers both the importance of an item in the database, i.e., its profit or external utility, and the importance of an item in the transaction, i.e., its quantity or internal utility. In this study, we propose a novel improved frequent-pattern tree (IFP-tree) structure, an extended prefix-tree structure for storing crucial information about frequent patterns, and develop an efficient IFP-tree-based mining method. The method generates a conditional utility pattern base, from which a conditional utility IFP-tree is built to mine the complete set of frequent patterns. Because the itemsets of the sanitized database and the original database are segmented into different areas, the itemsets kept in each area are indexed through candidate keys to speed up access and retrieval of data. This process increases the accuracy of the database and preserves sensitive data items for a longer time. The evaluation shows that generating fewer candidate patterns makes the algorithm run faster.
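
The utility calculation described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's IFP-tree algorithm; it only shows the commonly used definition of itemset utility (quantity times per-unit profit, summed over the items of the itemset and over the transactions that contain all of them). The abstract does not spell out this summation, so it is an assumption here. The item names, profit table, and transactions are hypothetical values chosen for illustration.

```python
from typing import Collection, Dict, List

# Hypothetical per-unit profit (external utility) of each item.
PROFIT: Dict[str, int] = {"a": 5, "b": 2, "c": 1}

# Hypothetical transactions: item -> purchased quantity (internal utility).
TRANSACTIONS: List[Dict[str, int]] = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1},
]


def utility_in_transaction(
    itemset: Collection[str],
    transaction: Dict[str, int],
    profit: Dict[str, int],
) -> int:
    """Utility of an itemset in one transaction.

    Returns 0 when the transaction does not contain every item of the
    itemset; otherwise sums quantity * per-unit profit over its items.
    """
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * profit[item] for item in itemset)


def itemset_utility(
    itemset: Collection[str],
    transactions: List[Dict[str, int]],
    profit: Dict[str, int],
) -> int:
    """Total utility of an itemset across all transactions."""
    return sum(utility_in_transaction(itemset, t, profit) for t in transactions)


def high_utility_items(
    transactions: List[Dict[str, int]],
    profit: Dict[str, int],
    min_utility: int,
) -> List[str]:
    """Single items whose total utility reaches the given threshold."""
    return sorted(
        item
        for item in profit
        if itemset_utility([item], transactions, profit) >= min_utility
    )
```

With these hypothetical values, item `a` is bought three times in total at a profit of 5 per unit, so `itemset_utility(["a"], TRANSACTIONS, PROFIT)` evaluates to 15, while the itemset `["a", "b"]` occurs together only in the first transaction and has utility 9. A frequency-only measure would rank `a`, `b`, and `c` equally (each appears in two transactions), which is the gap that utility mining is meant to close.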