Two‐Stage sampling designs for gene association studies
Author(s) - Thomas Duncan, Xie Rongrong, Gebregziabher Mulugeta
Publication year - 2004
Publication title - Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/gepi.20047
Subject(s) - stage (stratigraphy) , association (psychology) , sampling (signal processing) , statistics , biology , mathematics , computer science , psychology , paleontology , filter (signal processing) , computer vision , psychotherapist
Abstract - We consider two‐stage case‐control designs for testing associations between single nucleotide polymorphisms (SNPs) and disease, in which a subsample of subjects is used to select a panel of “tagging” SNPs that will be considered in the main study. We propose a pseudolikelihood [Pepe and Fleming, 1991: JASA 86:108–113] that combines the information from both the main study and the substudy to test the association with any polymorphism in the original set. SNP‐tagging [Chapman et al., 2003: Hum Hered 56:18–31] and haplotype‐tagging [Stram et al., 2003a: Hum Hered 55:27–36] approaches are compared. We show that the cost‐efficiency of such a design for estimating the relative risk associated with the causal polymorphism can be considerably better than for a single‐stage design, even if the causal polymorphism is not included in the tag‐SNP set. We also consider the optimal selection of cases and controls in such designs and the relative efficiency for estimating the location of a causal variant in linkage disequilibrium mapping. Nevertheless, as the cost of high‐volume genotyping plummets and haplotype tagging information from the International HapMap project [Gibbs et al., 2003: Nature 426:789–796] rapidly accumulates in public databases, such two‐stage designs may soon become unnecessary. Genet. Epidemiol. © 2004 Wiley‐Liss, Inc.