
Human microbiome privacy risks associated with summary statistics
Author(s) - Jae-Chang Cho
Publication year - 2021
Publication title - PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0249528
Subject(s) - microbiome , statistic , metadata , human microbiome , computer science , data science , internet privacy , biology , bioinformatics , statistics , world wide web , mathematics
The recognition that microbial community composition within the human microbiome is associated with the physiological state of the host has sparked a large number of human microbiome association studies (HMAS). As the volume of publicly available HMAS data grows, so does the privacy risk, because HMAS metadata can contain sensitive private information. I demonstrate that a simple test statistic based on the taxonomic profile of an individual’s microbiome, combined with summary statistics of HMAS data, can reveal whether that individual’s microbiome is a member of an HMAS sample. In particular, species-level taxonomic data obtained from small-scale HMAS can be highly vulnerable to this privacy risk. Minimal guidelines for HMAS data privacy are suggested, and an assessment of HMAS privacy risk using the proposed simulation method is recommended at the time of study design.
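
The kind of membership test described in the abstract can be illustrated with a small simulation. The abstract does not give the formula for the test statistic, so the sketch below is an assumption rather than the paper's method: it uses a distance-based statistic of the sort known from genomic summary-statistic privacy work, comparing how far an individual's taxonomic profile lies from a reference-population mean versus the published study mean. All names (`random_profile`, `membership_statistic`, `simulate`), the gamma-based synthetic profiles, and the sample sizes are hypothetical choices made for illustration only.

```python
import random


def random_profile(rng, n_taxa):
    """Draw a synthetic relative-abundance profile (non-negative, sums to 1)."""
    weights = [rng.gammavariate(0.5, 1.0) for _ in range(n_taxa)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_profile(profiles):
    """Per-taxon mean abundance: the 'summary statistic' a study might publish."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def membership_statistic(profile, study_mean, reference_mean):
    """Assumed distance-based statistic (not taken from the paper).

    Sums, over taxa, how much closer the individual's abundance is to the
    study mean than to the reference mean. Larger values point toward
    membership in the study sample.
    """
    return sum(
        abs(p - r) - abs(p - s)
        for p, s, r in zip(profile, study_mean, reference_mean)
    )


def simulate(n_study=10, n_reference=500, n_taxa=100, n_outsiders=50, seed=0):
    """Score study members and outsiders against the same summary statistics."""
    rng = random.Random(seed)
    study = [random_profile(rng, n_taxa) for _ in range(n_study)]
    reference = [random_profile(rng, n_taxa) for _ in range(n_reference)]
    outsiders = [random_profile(rng, n_taxa) for _ in range(n_outsiders)]

    study_mean = mean_profile(study)
    reference_mean = mean_profile(reference)

    member_scores = [
        membership_statistic(p, study_mean, reference_mean) for p in study
    ]
    outsider_scores = [
        membership_statistic(p, study_mean, reference_mean) for p in outsiders
    ]
    return member_scores, outsider_scores


if __name__ == "__main__":
    members, outsiders = simulate()
    print("mean member score:  ", sum(members) / len(members))
    print("mean outsider score:", sum(outsiders) / len(outsiders))
```

In this toy setting each member contributes one tenth of the study mean, so members tend to score higher than outsiders. Raising `n_study` shrinks that gap, which is in line with the abstract's remark that small-scale studies are the more vulnerable ones. A design-stage risk assessment along these lines would replace the synthetic profiles with realistic ones.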