Pseudosibship methods in the case‐parents design
Author(s) - Yu Zhaoxia, Deng Li
Publication year - 2011
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.4397
Subject(s) - matching (statistics) , multiplicative function , offspring , r package , allele , locus (genetics) , statistics , computer science , regression , biology , mathematics , genetics , gene , pregnancy , mathematical analysis
Recent evidence suggests that complex traits are likely determined by multiple loci, each of which contributes a weak to moderate individual effect. Although extensive literature exists on multilocus analysis of unrelated subjects, there are relatively few strategies for jointly analyzing multiple loci using family data. Here we address this issue by evaluating two pseudosibship methods: the 1:1 matching, which matches each affected offspring to the pseudosibling formed by the alleles not transmitted to the affected offspring, and the exhaustive matching, which matches each affected offspring to the pseudosiblings formed by all the other possible combinations of parental alleles. We prove that the two matching strategies use exactly and approximately the same amount of information from data under additive and multiplicative genetic models, respectively. Using numerical calculations under a variety of models and testing assumptions, we show that compared with the exhaustive matching, the 1:1 matching has comparable asymptotic power in detecting multiplicative/additive effects in single‐locus analysis and main effects in multilocus analysis, and it allows association testing of multiple linked loci. These results pave the way for many existing multilocus analysis methods developed for the case‐control (or matched case‐control) design to be applied to case‐parents data with minor modifications. As an example, with the 1:1 matching, we applied an L1‐regularized regression to a Crohn's disease dataset. Using the multiple loci selected in our approach, we obtained an order‐of‐magnitude decrease in p‐value and an 18.9% increase in prediction accuracy when compared with using the most significant individual locus. Copyright © 2011 John Wiley & Sons, Ltd.
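To make the two matching schemes concrete, the following is a minimal single-locus sketch, not the authors' implementation. It assumes that each parent's genotype is given as an ordered pair of alleles and that it is known which allele each parent transmitted; in real case-parents data this can be ambiguous. The function name `pseudosiblings` and its argument layout are hypothetical, chosen only for illustration.

```python
from itertools import product


def pseudosiblings(mother, father, transmitted):
    """Build pseudosiblings for one case-parents trio at a single locus.

    mother, father: 2-tuples of alleles, e.g. ("A", "a").
    transmitted: (m_idx, f_idx), the index (0 or 1) of the allele each
        parent passed to the affected offspring. Assumed known here.
    Returns (case, one_to_one, exhaustive).
    """
    m_idx, f_idx = transmitted

    # Genotype of the affected offspring (the "case").
    case = (mother[m_idx], father[f_idx])

    # 1:1 matching: a single pseudosibling made of the two
    # non-transmitted parental alleles.
    one_to_one = (mother[1 - m_idx], father[1 - f_idx])

    # Exhaustive matching: the three other possible combinations
    # of one maternal and one paternal allele.
    exhaustive = [
        (mother[i], father[j])
        for i, j in product((0, 1), repeat=2)
        if (i, j) != (m_idx, f_idx)
    ]
    return case, one_to_one, exhaustive


# Example: both parents heterozygous, each transmitting the "A" allele.
case, matched, others = pseudosiblings(("A", "a"), ("A", "a"), (0, 0))
print(case, matched, others)
```

Under the 1:1 scheme each trio yields one matched case-control pair, which is why methods built for matched case-control data can be reused with little change; the exhaustive scheme yields a 1:3 matched set at a single locus.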