Interactive knowledge discovery from hidden data through sampling of frequent patterns
Author(s) - Bhuiyan Mansurul, Hasan Mohammad Al
Publication year - 2016
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.11322
Subject(s) - computer science, data mining, task (project management), sampling (signal processing), knowledge extraction, session (web analytics), markov chain monte carlo, confidentiality, revenue, set (abstract data type), data science, hidden markov model, machine learning, artificial intelligence, world wide web, bayesian probability, filter (signal processing), finance, computer security, economics, programming language, management, computer vision
In real life, many important datasets are not publicly accessible for various reasons, including privacy protection and the maintenance of business competitiveness. However, knowledge discovery and pattern mining from these datasets can bring enormous benefit to both the data owner and external entities. In this paper, we propose a novel solution for this task, based on Markov chain Monte Carlo (MCMC) sampling of frequent patterns. Instead of returning all the frequent patterns, the proposed paradigm returns a small set of randomly selected patterns so that the confidentiality of the dataset can be maintained. Our solution also allows interactive sampling, so that the sampled patterns can effectively fulfill the user's requirements. We show experimental results on several real‐life datasets to validate the capability and usefulness of our solution. In particular, we show by example that, using our proposed solution, an eCommerce marketplace can allow pattern mining on user session data without disclosing the data to the public; such a mining paradigm can help the sellers in the marketplace, which can eventually boost the market's own revenue. © 2016 Wiley Periodicals, Inc. Statistical Analysis and Data Mining: The ASA Data Science Journal, 2016
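
To make the idea of MCMC sampling of frequent patterns concrete, the following is a minimal sketch and not the authors' algorithm, whose details the abstract does not give. It assumes that patterns are itemsets, that the target distribution is uniform over all frequent itemsets (including the empty set), and that the chain is a Metropolis-Hastings random walk that adds or removes one item per step. The names `support`, `neighbors`, and `sample_frequent_patterns`, as well as the parameters `min_support` and `steps_per_sample`, are hypothetical. The interactive feedback component mentioned in the abstract is not modeled.

```python
import random


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def neighbors(itemset, items, transactions, min_support):
    """Frequent itemsets reachable by removing or adding exactly one item."""
    # Every subset of a frequent itemset is frequent, so removals need no check.
    result = [itemset - {i} for i in itemset]
    for i in items:
        if i not in itemset:
            candidate = itemset | {i}
            if support(candidate, transactions) >= min_support:
                result.append(candidate)
    return result


def sample_frequent_patterns(transactions, min_support, n_samples,
                             steps_per_sample=50, seed=0):
    """Return a small random set of frequent itemsets instead of all of them.

    Assumption: the target distribution is uniform over frequent itemsets,
    so a proposed move x -> y is accepted with probability
    min(1, deg(x) / deg(y)), where deg is the number of neighbors.
    Assumes min_support <= len(transactions), so the empty set is frequent.
    """
    rng = random.Random(seed)
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))

    current = frozenset()  # the empty itemset is the starting state
    current_nbrs = neighbors(current, items, transactions, min_support)
    samples = []
    for _ in range(n_samples):
        for _ in range(steps_per_sample):
            if not current_nbrs:  # no frequent single item: chain cannot move
                break
            proposal = rng.choice(current_nbrs)
            proposal_nbrs = neighbors(proposal, items, transactions, min_support)
            accept = min(1.0, len(current_nbrs) / len(proposal_nbrs))
            if rng.random() < accept:
                current, current_nbrs = proposal, proposal_nbrs
        samples.append(current)
    return samples


if __name__ == "__main__":
    # Toy data standing in for the hidden dataset held by the data owner.
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for pattern in sample_frequent_patterns(toy, min_support=3, n_samples=5):
        print(sorted(pattern))
```

In this sketch the data owner would run the sampler on the private transactions and release only the returned patterns. Recomputing support by scanning all transactions at every step is chosen for brevity, not efficiency.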
