Missing phenotype data imputation in pedigree data analysis
Author(s) -
Fridley Brooke L.,
de Andrade Mariza
Publication year - 2008
Publication title -
Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/gepi.20261
Subject(s) - missing data, imputation (statistics), Markov chain Monte Carlo, Bayesian probability, longitudinal data, data set, computer science, statistics, data mining, mathematics
Mapping complex traits with small genetic effects, whose phenotypes may be modulated by temporal trends within families, is challenging. Detailed and accurate data must be available on families, whether or not the data were collected over time. Missing data complicate pedigree analysis, especially longitudinal pedigree analysis. Because most analytical methods developed for longitudinal pedigree data require complete data, the researcher must either drop the cases (individuals) with missing data from the analysis or impute values for the missing data. We present the use of data augmentation within Bayesian polygenic and longitudinal polygenic models to produce k complete datasets. The data augmentation, or imputation, step of the Markov chain Monte Carlo algorithm takes into account the observed familial information and the observed subject information available at other time points. These k complete datasets can then be used to fit single time point or longitudinal pedigree models. Because k complete datasets yield k sets of parameter estimates, the total variance associated with an estimate can be partitioned into a within‐imputation and a between‐imputation component. The method is illustrated using the Genetic Analysis Workshop simulated data. Genet. Epidemiol. 2007. © 2007 Wiley‐Liss, Inc.
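The variance partition described in the abstract can be illustrated with a minimal sketch. The abstract does not give the pooling formula, so the sketch assumes the standard multiple-imputation combining rules (Rubin's rules), in which the total variance is the mean within-imputation variance plus the between-imputation variance inflated by (1 + 1/k). The function name `pool_estimates` and the input values are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Combine k per-imputation estimates of one parameter.

    Assumes Rubin's rules; the paper's exact pooling procedure is not
    stated in the abstract.
    estimates: point estimates, one per completed dataset.
    variances: squared standard errors, one per completed dataset.
    """
    k = len(estimates)
    if k < 2 or len(variances) != k:
        raise ValueError("need k >= 2 estimates with matching variances")

    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # within-imputation variance
    between = variance(estimates)   # between-imputation variance (k - 1 divisor)
    total = within + (1 + 1 / k) * between
    return pooled, within, between, total


# Hypothetical estimates from k = 3 completed datasets.
pooled, within, between, total = pool_estimates(
    [1.0, 2.0, 3.0], [0.5, 0.5, 0.5]
)
print(pooled, within, between, total)
```

A large between-imputation component relative to the within-imputation component suggests that the missing data contribute substantially to the uncertainty of the estimate.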