Using partially synthetic microdata to protect sensitive cells in business statistics
Author(s) - Javier Miranda, Lars Vilhuber
Publication year - 2016
Publication title - Statistical Journal of the IAOS
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.286
H-Index - 16
eISSN - 1875-9254
pISSN - 1874-7655
DOI - 10.3233/sji-160963
Subject(s) - microdata (statistics), synthetic data, census, computer science, public use, statistics, econometrics, data science, data mining, mathematics, artificial intelligence, political science, sociology, population, demography, law
We describe and analyze a method that blends records from both observed and synthetic microdata into public-use tabulations of establishment statistics. The resulting tables use synthetic data only in potentially sensitive cells. We describe different algorithms and present preliminary results from applying them to the Census Bureau's Business Dynamics Statistics and Synthetic Longitudinal Business Database, highlighting the accuracy and protection afforded by the method when compared to existing public-use tabulations (with suppressions).
Vilhuber acknowledges support through NSF Grants SES-1042181 and BCS-0941226.
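The idea of using synthetic data only in potentially sensitive cells can be illustrated with a minimal sketch. This is not one of the paper's algorithms: the abstract does not specify them. The sensitivity rule (a cell with fewer than `min_count` observed establishments), the cell-level substitution (the paper may blend at the record level instead), and the names `tabulate`, `blend_tables`, `sector`, and `employment` are all assumptions made for illustration.

```python
from collections import defaultdict


def tabulate(records, key, value):
    """Count records and sum `value` within each cell defined by `key`."""
    cells = defaultdict(lambda: {"count": 0, "total": 0})
    for rec in records:
        cell = cells[rec[key]]
        cell["count"] += 1
        cell["total"] += rec[value]
    return dict(cells)


def blend_tables(observed, synthetic, key, value, min_count=3):
    """Publish observed totals, except in cells flagged as sensitive.

    ASSUMPTION: a cell is "sensitive" when it has fewer than `min_count`
    observed records. This threshold rule is hypothetical and is not
    taken from the paper.
    """
    obs = tabulate(observed, key, value)
    syn = tabulate(synthetic, key, value)
    blended = {}
    for cell_id, obs_cell in obs.items():
        if obs_cell["count"] >= min_count:
            # Non-sensitive cell: release the observed total unchanged.
            blended[cell_id] = {"total": obs_cell["total"], "source": "observed"}
        elif cell_id in syn:
            # Sensitive cell: release the synthetic total instead.
            blended[cell_id] = {"total": syn[cell_id]["total"], "source": "synthetic"}
        else:
            # No synthetic counterpart available: fall back to suppression.
            blended[cell_id] = {"total": None, "source": "suppressed"}
    return blended


# Toy data, invented for this example only.
observed = [
    {"sector": "A", "employment": 10},
    {"sector": "A", "employment": 20},
    {"sector": "A", "employment": 30},
    {"sector": "B", "employment": 500},
]
synthetic = [
    {"sector": "A", "employment": 12},
    {"sector": "A", "employment": 18},
    {"sector": "A", "employment": 33},
    {"sector": "B", "employment": 470},
]

table = blend_tables(observed, synthetic, key="sector", value="employment")
for cell_id in sorted(table):
    print(cell_id, table[cell_id])
```

In this toy table, sector A has three observed establishments and keeps its observed total, while sector B has a single establishment and is filled from the synthetic records rather than suppressed. That contrast with suppression is what the abstract's comparison to existing public-use tabulations refers to.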