Analysis of case‐control studies of genetic and environmental factors with missing genetic information and haplotype‐phase ambiguity
Author(s) - Christine Spinka, Raymond J. Carroll, Nilanjan Chatterjee
Publication year - 2005
Publication title - Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/gepi.20085
Subject(s) - context (archaeology), missing data, ambiguity, population, expectation–maximization algorithm, econometrics, statistics, genetics, computer science, biology, mathematics, maximum likelihood, medicine, paleontology, environmental health, programming language
Abstract - Case‐control studies of unrelated subjects are now widely used to study the role of genetic susceptibility and gene‐environment interactions in the etiology of complex diseases. Exploiting an assumption of gene‐environment independence, and treating the distribution of environmental exposures as completely nonparametric, Chatterjee and Carroll [2005] (Biometrika 92:399–418) recently developed an efficient retrospective maximum‐likelihood method for analysis of case‐control studies. In this article, we develop an extension of the retrospective maximum‐likelihood approach to studies where genetic information may be missing on some study subjects. In particular, special emphasis is given to haplotype‐based studies where missing data arise due to linkage‐phase ambiguity of genotype data. We use a profile likelihood technique and an appropriate expectation‐maximization (EM) algorithm to derive a relatively simple procedure for parameter estimation, with or without a rare disease assumption, and possibly incorporating information on the marginal probability of the disease for the underlying population. We also describe two alternative robust approaches that are less sensitive to the underlying gene‐environment independence and Hardy‐Weinberg‐equilibrium assumptions. The performance of the proposed methods is studied using simulation studies in the context of haplotype‐based studies of gene‐environment interactions. An application of the proposed method is illustrated using a case‐control study of ovarian cancer designed to investigate the interaction between BRCA1/2 mutations and reproductive risk factors in the etiology of ovarian cancer. Genet. Epidemiol., 2005. Published 2005 Wiley‐Liss, Inc.
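
To make the phase-ambiguity problem mentioned in the abstract concrete, the sketch below shows the classical EM algorithm for estimating haplotype frequencies from unphased genotypes under Hardy-Weinberg equilibrium. It is not the retrospective profile-likelihood procedure of the article: it has no disease model, no environmental covariates, and no case-control sampling. It assumes two biallelic SNPs, genotypes coded as counts (0, 1, 2) of the variant allele at each SNP, and the function names `compatible_pairs` and `em_haplotype_frequencies` are hypothetical.

```python
from itertools import product

# The four possible haplotypes over two biallelic SNPs (assumed setting).
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Return all ordered haplotype pairs whose allele counts sum to the genotype."""
    return [
        (h1, h2)
        for h1, h2 in product(HAPLOTYPES, repeat=2)
        if tuple(a + b for a, b in zip(h1, h2)) == tuple(genotype)
    ]


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies from unphased genotypes, assuming HWE."""
    freqs = {h: 1.0 / len(HAPLOTYPES) for h in HAPLOTYPES}
    for _ in range(n_iter):
        # E-step: expected haplotype counts given the current frequencies.
        counts = {h: 0.0 for h in HAPLOTYPES}
        for g in genotypes:
            pairs = compatible_pairs(g)
            # Under HWE, a pair's probability is the product of its frequencies.
            weights = [freqs[h1] * freqs[h2] for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: each subject carries two haplotypes.
        new_freqs = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        change = max(abs(new_freqs[h] - freqs[h]) for h in HAPLOTYPES)
        freqs = new_freqs
        if change < tol:
            break
    return freqs


# Double heterozygotes (1, 1) are the only phase-ambiguous genotypes here:
# they could be (00, 11) or (01, 10).
example = [(0, 0)] * 10 + [(2, 2)] * 10 + [(1, 1)] * 5
estimated = em_haplotype_frequencies(example)
```

In this toy data set the unambiguous homozygotes carry only the 00 and 11 haplotypes, so the EM iterations assign the ambiguous double heterozygotes almost entirely to the (00, 11) configuration.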