Control of confounding through secondary samples
Author(s) - Yin Li, Sundberg Rolf, Wang Xiaoqin, Rubin Donald B.
Publication year - 2006
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.2468
Subject(s) - confounding, sample size determination, statistics, inference, sample (material), econometrics, mathematics, computer science, chemistry, artificial intelligence, chromatography
The control of confounding is essential in many statistical problems, especially in those that attempt to estimate exposure effects. In some cases, in addition to the ‘primary’ sample, there is another ‘secondary’ sample which, though having no direct information about the exposure effect, contains information about the confounding factors. The purpose of this article is to study the influence of the secondary sample on likelihood inference for the exposure effect. In particular, we investigate the interplay between the efficiency improvement and the possible bias introduced by the secondary sample as a function of the degree of confounding in the primary sample and the sizes of the primary and secondary samples. With weak confounding, the secondary sample can improve estimation of the exposure effect only slightly, whereas with strong confounding it can be much more useful. On the other hand, it is more important to consider possible biasing effects in the latter case. For illustration, we use a formal example of a generalized linear model and a real example with sparse data from a case–control study of the association between gastric cancer and HM‐CAP/Band 120. Copyright © 2006 John Wiley & Sons, Ltd.
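The trade-off described in the abstract can be illustrated with a small calculation. The sketch below is not the model used in the article; it assumes a hypothetical Gaussian linear model, outcome = intercept + beta * exposure + gamma * confounder + noise, in which exposure and confounder have zero mean, unit variance and correlation `rho` in the primary sample, while the secondary sample consists of unexposed subjects (exposure fixed at 0) with the same confounder distribution. All function and parameter names are illustrative. The code works with expected information matrices, so no data are simulated.

```python
import numpy as np


def information_matrices(rho, n_primary, n_secondary):
    """Expected information (up to the noise variance) for (intercept, beta, gamma).

    Assumption: zero-mean, unit-variance exposure and confounder with
    correlation rho in the primary sample; exposure is identically 0
    in the secondary sample, so it carries no direct information on beta.
    """
    primary = n_primary * np.array([[1.0, 0.0, 0.0],
                                    [0.0, 1.0, rho],
                                    [0.0, rho, 1.0]])
    secondary = n_secondary * np.array([[1.0, 0.0, 0.0],
                                        [0.0, 0.0, 0.0],
                                        [0.0, 0.0, 1.0]])
    return primary, secondary


def efficiency_gain(rho, n_primary, n_secondary):
    """Ratio Var(beta_hat | primary only) / Var(beta_hat | pooled samples)."""
    primary, secondary = information_matrices(rho, n_primary, n_secondary)
    var_primary = np.linalg.inv(primary)[1, 1]
    var_pooled = np.linalg.inv(primary + secondary)[1, 1]
    return var_primary / var_pooled


def pooling_bias(rho, n_primary, n_secondary, delta):
    """Large-sample bias of the pooled beta_hat when the confounder effect
    in the secondary sample differs from the primary one by delta
    (a hypothetical form of incompatibility between the two samples)."""
    primary, secondary = information_matrices(rho, n_primary, n_secondary)
    shift = np.array([0.0, 0.0, delta])
    return (np.linalg.inv(primary + secondary) @ secondary @ shift)[1]


if __name__ == "__main__":
    for rho in (0.1, 0.9):
        gain = efficiency_gain(rho, 200, 200)
        bias = pooling_bias(rho, 200, 200, delta=0.5)
        print(f"rho={rho}: efficiency gain={gain:.3f}, bias={bias:.3f}")
```

Under these assumptions, and with equally sized samples, the variance of the exposure-effect estimate barely changes when the confounding is weak, while it shrinks by roughly a factor of three when `rho` is 0.9. The bias caused by a mismatched secondary sample grows with `rho` in the same way, which mirrors the qualitative pattern stated in the abstract.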