Stratified case sampling and the use of family controls
Author(s) - Siegmund Kimberly D., Langholz Bryan
Publication year - 2001
Publication title - Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/gepi.3
Subject(s) - statistics, sampling (signal processing), cousin, stratified sampling, efficiency, population, sampling design, random effects model, logistic regression, family history, mathematics, biology, genetics, econometrics, demography, computer science, medicine, meta analysis, estimator, archaeology, filter (signal processing), sociology, computer vision, history
We compare the asymptotic relative efficiency (ARE) of different study designs for estimating gene and gene‐environment interaction effects using matched case‐control data. In the sampling schemes considered, cases are selected differentially based on their family history of disease. Controls are selected either from unrelated subjects or from among the case’s unaffected siblings and cousins. Parameters are estimated using weighted conditional logistic regression, where the likelihood contributions for each subject are weighted by the fraction of cases sampled sharing the same family history. Results showed that, compared with random sampling, over‐sampling cases with a positive family history increased the efficiency for estimating the main effect of a gene for sib‐control designs (103–254% ARE) and decreased efficiency for cousin‐control and population‐control designs (68–94% ARE and 67–84% ARE, respectively). Population controls and random sampling of cases were most efficient for a recessive gene or a dominant gene with a relative risk less than 9. For estimating gene‐environment interactions, over‐sampling positive‐family‐history cases again led to increased efficiency using sib controls (111–180% ARE) and decreased efficiency using population controls (68–87% ARE). Using case‐cousin pairs, the results differed based on the genetic model and the size of the interaction effect; biased sampling was only slightly more efficient than random sampling for large interaction effects under a dominant gene model (relative risk ratio = 8, 106% ARE). Overall, the most efficient study design for studying gene‐environment interaction was the case‐sib‐control design with over‐sampling of positive‐family‐history cases. Genet. Epidemiol. 20:316–327, 2001. © 2001 Wiley‐Liss, Inc.
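As a rough illustration of the weighted conditional logistic regression mentioned in the abstract, the sketch below fits a single log relative risk from 1:1 matched case-control pairs by maximising a weighted conditional log-likelihood. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `fit_weighted_pairs`, the toy data, and the weight values are hypothetical, and the weights are taken as given inputs rather than derived from the sampling fractions as in the paper.

```python
import math


def fit_weighted_pairs(diffs, weights, n_iter=25, tol=1e-10):
    """Fit beta for 1:1 matched pairs by Newton's method.

    diffs[i]   = exposure of case i minus exposure of its matched control
    weights[i] = weight applied to pair i's likelihood contribution
                 (assumed supplied by the sampling design)

    Maximises sum_i w_i * log(p_i), where
    p_i = 1 / (1 + exp(-beta * d_i)) is the conditional probability
    that the case, rather than the control, is the affected member.
    """
    beta = 0.0
    for _ in range(n_iter):
        score = 0.0  # first derivative of the weighted log-likelihood
        info = 0.0   # negative second derivative
        for d, w in zip(diffs, weights):
            p = 1.0 / (1.0 + math.exp(-beta * d))
            score += w * d * (1.0 - p)
            info += w * d * d * p * (1.0 - p)
        if info == 0.0:
            break  # no discordant pairs, so beta is not identified
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


# Hypothetical toy data: binary exposure (e.g. gene carrier status).
# Pairs with diff 0 are concordant and carry no information about beta.
diffs = [1, 1, 1, -1, -1, 0]
weights = [2.0, 2.0, 2.0, 1.0, 1.0, 1.0]
beta_hat = fit_weighted_pairs(diffs, weights)
```

For a binary exposure the weighted estimate has a closed form: the exponential of `beta_hat` equals the weighted count of pairs in which only the case is exposed divided by the weighted count of pairs in which only the control is exposed, which is 6/2 in this toy data.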