Validity of Privacy-Protecting Analytical Methods That Use Only Aggregate-Level Information to Conduct Multivariable-Adjusted Analysis in Distributed Data Networks

Open Access

Author(s) - Xiaojuan Li, Bruce Fireman, Jeffrey R. Curtis, David Arterburn, David Fisher, Érick Moyneur, M. J. Gallagher, Marsha A. Raebel, W. Benjamin Nowell, Lindsay Lagreid, Sengwee Toh

Publication year - 2018

Publication title - American Journal of Epidemiology

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 2.33

H-Index - 256

eISSN - 1476-6256

pISSN - 0002-9262

DOI - 10.1093/aje/kwy265

Subject(s) - weighting, computer science, covariate, propensity score matching, matching (statistics), confounding, statistics, data mining, aggregate (composite), aggregate data, comparability, data set, inverse probability weighting, medicine, mathematics, machine learning, artificial intelligence, materials science, combinatorics, composite material, radiology

Distributed data networks enable large-scale epidemiologic studies, but protecting privacy while adequately adjusting for a large number of covariates continues to pose methodological challenges. Using 2 empirical examples within a 3-site distributed data network, we tested combinations of 3 aggregate-level data-sharing approaches (risk-set, summary-table, and effect-estimate), 4 confounding adjustment methods (matching, stratification, inverse probability weighting, and matching weighting), and 2 summary scores (propensity score and disease risk score) for binary and time-to-event outcomes. We assessed the performance of combinations of these data-sharing and adjustment methods by comparing their results with results from the corresponding pooled individual-level data analysis (reference analysis). For both types of outcomes, the method combinations examined yielded results identical or comparable to the reference results in most scenarios. Within each data-sharing approach, comparability between aggregate- and individual-level data analysis depended on adjustment method; for example, risk-set data-sharing with matched or stratified analysis of summary scores produced identical results, while weighted analysis showed some discrepancies. Across the adjustment methods examined, risk-set data-sharing generally performed better, while summary-table and effect-estimate data-sharing more often produced discrepancies in settings with rare outcomes and small sample sizes. Valid multivariable-adjusted analysis can be performed in distributed data networks without sharing of individual-level data.
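
To make the summary-table idea concrete, here is a minimal sketch of one way a coordinating center could combine aggregate counts for a binary outcome. It assumes each site stratifies its own patients on a summary score and shares only a 2x2 table of counts per stratum, and it pools those tables with a Mantel-Haenszel odds ratio. The estimator choice, the `StratumTable` type, the function name, and all counts are illustrative assumptions, not the authors' implementation or data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class StratumTable:
    """Aggregate 2x2 counts for one site-by-score stratum (hypothetical type).

    Only these four counts leave the site; no patient-level rows are shared.
    """
    exposed_events: int
    exposed_nonevents: int
    unexposed_events: int
    unexposed_nonevents: int


def mantel_haenszel_or(tables: Iterable[StratumTable]) -> float:
    """Pool stratum-level 2x2 tables into a Mantel-Haenszel odds ratio."""
    numerator = 0.0
    denominator = 0.0
    for t in tables:
        n = (t.exposed_events + t.exposed_nonevents
             + t.unexposed_events + t.unexposed_nonevents)
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += t.exposed_events * t.unexposed_nonevents / n
        denominator += t.exposed_nonevents * t.unexposed_events / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these tables")
    return numerator / denominator


# Made-up counts for two sites; each entry is one summary-score stratum.
site_a = [StratumTable(10, 90, 5, 95), StratumTable(20, 80, 10, 90)]
site_b = [StratumTable(4, 46, 2, 48)]

pooled_or = mantel_haenszel_or(site_a + site_b)
print(f"Pooled Mantel-Haenszel OR: {pooled_or:.3f}")
```

Because every stratum is kept separate by site, the pooled estimate is stratified on both site and summary score. Strata with very few events contribute little information, which is consistent with the abstract's note that summary-table sharing showed more discrepancies with rare outcomes and small samples.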
