Two‐stage methods for the analysis of pooled data
Author(s) - Stukel Therese A., Demidenko Eugene, Dykes James, Karagas Margaret R.
Publication year - 2001
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.852
Subject(s) - covariate, statistics, random effects model, estimator, mixed model, sample size determination, linear model, generalized linear mixed model, confounding, standard error, econometrics, logistic regression, linear regression, mathematics, meta analysis, medicine
Abstract - Epidemiologic studies of disease often produce inconclusive or contradictory results due to small sample sizes or regional variations in the disease incidence or the exposures. To clarify these issues, researchers occasionally pool and reanalyse original data from several large studies. In this paper we explore the use of a two‐stage random‐effects model for analysing pooled case‐control studies and undertake a thorough examination of bias in the pooled estimator under various conditions. The two‐stage model analyses each study using the model appropriate to the design with study‐specific confounders, and combines the individual study‐specific adjusted log‐odds ratios using a linear mixed‐effects model; it is computationally simple and can incorporate study‐level covariates and random effects. Simulations indicate that when the individual studies are large, two‐stage methods produce nearly unbiased exposure estimates and standard errors of the exposure estimates from a generalized linear mixed model. By contrast, joint fixed‐effects logistic regression produces attenuated exposure estimates and underestimates the standard error when heterogeneity is present. While bias in the pooled regression coefficient increases with interstudy heterogeneity for both models, it is much smaller using the two‐stage model. In pooled analyses, where covariates may not be uniformly defined and coded across studies, and occasionally not measured in all studies, a joint model is often not feasible. The two‐stage method is shown to be a simple, valid and practical method for the analysis of pooled binary data. The results are applied to a study of reproductive history and cutaneous melanoma risk in women using data from ten large case‐control studies. Copyright © 2001 John Wiley & Sons, Ltd.
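
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on two simplifying assumptions: stage 1 uses a crude log-odds ratio from a 2x2 table (with Woolf's variance) in place of the study-specific adjusted logistic regression, and stage 2 uses the DerSimonian-Laird moment estimator of the between-study variance in place of a fitted linear mixed-effects model. The function names `log_odds_ratio` and `pool_random_effects` are hypothetical and chosen for illustration only.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Stage 1 (simplified): crude log-odds ratio and its variance.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    Assumption: no confounder adjustment, unlike the adjusted
    study-specific estimates described in the abstract.
    """
    estimate = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d  # Woolf's variance formula
    return estimate, variance


def pool_random_effects(estimates, variances):
    """Stage 2 (simplified): DerSimonian-Laird random-effects pooling.

    Returns (pooled estimate, standard error, between-study variance tau2).
    Assumption: a moment estimator stands in for the mixed-model fit.
    """
    k = len(estimates)
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2
```

In use, each study's table would be passed through `log_odds_ratio`, and the resulting lists of estimates and variances handed to `pool_random_effects`. When the estimated between-study variance is zero, the pooled value reduces to the usual inverse-variance fixed-effect estimate; as heterogeneity grows, the weights become more equal across studies and the pooled standard error widens.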