Winning the NIST Contest: A scalable and general approach to differentially private synthetic data
Author(s) - Ryan McKenna, Gerome Miklau, Daniel Sheldon
Publication year - 2021
Publication title - Journal of Privacy and Confidentiality
Language(s) - English
Resource type - Journals
ISSN - 2575-8527
DOI - 10.29012/jpc.778
Subject(s) - nist, synthetic data, computer science, scalability, parametric statistics, contest, noise (video), data mining, artificial intelligence, mathematics, statistics, speech recognition, database, political science, law, image (mathematics)
We propose a general approach for differentially private synthetic data generation that consists of three steps: (1) select a collection of low-dimensional marginals, (2) measure those marginals with a noise addition mechanism, and (3) generate synthetic data that preserves the measured marginals well. Central to this approach is Private-PGM, a post-processing method that estimates a high-dimensional data distribution from noisy measurements of its marginals. We present two mechanisms, NIST-MST and MST, that are instances of this general approach. NIST-MST was the winning mechanism in the 2018 NIST differential privacy synthetic data competition, and MST is a new mechanism that works in more general settings while still performing comparably to NIST-MST. We believe our general approach should be of broad interest and can be adopted in future mechanisms for synthetic data generation.
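
The three steps can be illustrated with a minimal Python sketch. This is not the NIST-MST or MST implementation, and it does not use Private-PGM: the selected attribute pairs are hard-coded, the noise scale `sigma` is an arbitrary placeholder that is not calibrated to any privacy budget, and step 3 is replaced by a naive stand-in that samples from a single noisy marginal. The names `measure_marginal` and `sample_from_marginal` are hypothetical.

```python
import numpy as np

def measure_marginal(data, cols, domain, sigma, rng):
    """Step 2: contingency table over `cols` plus Gaussian noise."""
    shape = tuple(domain[c] for c in cols)
    # One bin per integer category value 0..k-1 for each attribute.
    bins = [np.arange(k + 1) for k in shape]
    counts, _ = np.histogramdd(data[:, cols], bins=bins)
    return counts + rng.normal(0.0, sigma, size=shape)

def sample_from_marginal(noisy, n, rng):
    """Naive stand-in for step 3 (NOT Private-PGM): clip negative
    counts, normalize to a distribution, and sample n records."""
    probs = np.clip(noisy, 0.0, None).ravel()
    probs = probs / probs.sum()  # assumes some positive mass remains
    cells = rng.choice(probs.size, size=n, p=probs)
    return np.column_stack(np.unravel_index(cells, noisy.shape))

rng = np.random.default_rng(0)

# Toy dataset: 1000 records, three categorical attributes.
domain = [2, 3, 4]
data = np.column_stack([rng.integers(0, k, size=1000) for k in domain])

# Step 1 (assumed here): a fixed selection of two-way marginals.
selected = [[0, 1], [1, 2]]

# Step 2: measure each selected marginal with noise.
measurements = {
    tuple(cols): measure_marginal(data, cols, domain, sigma=5.0, rng=rng)
    for cols in selected
}

# Step 3 (stand-in): synthetic records for attributes 0 and 1 only.
synthetic = sample_from_marginal(measurements[(0, 1)], n=1000, rng=rng)
```

The stand-in for step 3 uses only one measurement and covers only two attributes. The role the abstract assigns to Private-PGM is to combine all noisy marginals into one estimate of the full high-dimensional distribution, which this sketch does not attempt.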
