How to deal with malicious users in privacy‐preserving distributed data mining | Zendy

Duan Yitao | Zendy; Canny John | Zendy

AI Assistant Blog Pricing

Home ZAIA Blog

Premium

How to deal with malicious users in privacy‐preserving distributed data mining

Author(s) -

Duan Yitao,

Canny John

Publication year - 2009

Publication title -

statistical analysis and data mining: the asa data science journal

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.381

H-Index - 33

eISSN - 1932-1872

pISSN - 1932-1864

DOI - 10.1002/sam.10029

Subject(s) - computer science , random projection , data mining , id3 , cryptography , field (mathematics) , computation , encryption , information privacy , algorithm , decision tree , computer security , mathematics , pure mathematics , decision tree learning

A major problem in current privacy‐preserving data‐mining research is the lack of practical mechanisms to deal with malicious users who may submit bogus data to bias the computation. In this paper we explore private computation built on vector addition and its applications in privacy‐preserving data mining. We show that such a paradigm not only supports a large number of popular data‐mining algorithms (including both linear and nonlinear ones such as SVD, k ‐means, ID3, machine‐learning algorithms based on Expectation‐Maximization (EM), etc., and all algorithms in the statistical query model) with efficiency comparable to that of a regular, nonprivate implementation, but also admits extremely efficient zero‐knowledge (ZK) protocols for verifying properties of user data. These ZK protocols are based on random projection and use a linear number of inexpensive small field (e.g. 32 or 64 bits) operations, and only a logarithmic number of large‐field (1024 bits or more) cryptographic operations, achieving orders of magnitude reduction in running time over standard techniques (from hours to seconds) for large‐scale problems. Such ZK tools thus provide practical solutions for handling the malicious users problem for many real‐world applications. © 2009 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 1: 18‐33, 2009

This content is not available in your region!

Continue researching here.

Having issues? You can contact us here

Accelerating Research