z-logo
open-access-imgOpen Access
An Improved Sanitization Algorithm in Privacy-Preserving Utility Mining
Author(s) -
Xuan Liu,
Genlang Chen,
Sheng Wen,
Guanghui Song
Publication year - 2020
Publication title -
mathematical problems in engineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.262
H-Index - 62
eISSN - 1026-7077
pISSN - 1024-123X
DOI - 10.1155/2020/7489045
Subject(s) - computer science , database transaction , data mining , heuristic , private information retrieval , process (computing) , information sensitivity , information loss , algorithm , artificial intelligence , database , computer security , operating system
High-utility pattern mining is an effective technique that extracts significant information from varied types of databases. However, the analysis of data with sensitive private information may cause privacy concerns. To achieve better trade-off between utility maximizing and privacy preserving, privacy-preserving utility mining (PPUM) has become an important research topic in recent years. The MSICF algorithm is a sanitization algorithm for PPUM. It selects the item based on the conflict count and identifies the victim transaction based on the concept of utility. Although MSICF is effective, the heuristic selection strategy can be improved to obtain a lower ratio of side effects. In our paper, we propose an improved sanitization approach named the Improved Maximum Sensitive Itemsets Conflict First Algorithm (IMSICF) to address this issue. It dynamically calculates conflict counts of sensitive items in the sanitization process. In addition, IMSICF chooses the transaction with the minimum number of nonsensitive itemsets and the maximum utility in a sensitive itemset for modification. Extensive experiments have been conducted on various datasets to evaluate the effectiveness of our proposed algorithm. The results show that IMSICF outperforms other state-of-the-art algorithms in terms of minimizing side effects on nonsensitive information. Moreover, the influence of correlation among itemsets on various sanitization algorithms’ performance is observed.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here