Posterior predictive checking of multiple imputation models
Author(s) -
Cattram D. Nguyen,
Katherine J. Lee,
John B. Carlin
Publication year - 2015
Publication title -
Biometrical Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.108
H-Index - 63
eISSN - 1521-4036
pISSN - 0323-3847
DOI - 10.1002/bimj.201400034
Subject(s) - imputation (statistics), posterior predictive distribution, missing data, computer science, posterior probability, predictive value, statistics, data mining, mathematics, artificial intelligence, machine learning, Bayesian probability, Bayesian linear regression, medicine, Bayesian inference
Multiple imputation is gaining popularity as a strategy for handling missing data, but there is a scarcity of tools for checking imputation models, a critical step in model fitting. Posterior predictive checking (PPC) has been recommended as an imputation diagnostic. PPC involves simulating "replicated" data from the posterior predictive distribution of the model under scrutiny. Model fit is assessed by examining whether the analysis from the observed data appears typical of results obtained from the replicates produced by the model. A proposed diagnostic measure is the posterior predictive "p-value", an extreme value of which (i.e., a value close to 0 or 1) suggests a misfit between the model and the data. The aim of this study was to evaluate the performance of the posterior predictive p-value as an imputation diagnostic. Using simulation methods, we deliberately misspecified imputation models to determine whether posterior predictive p-values were effective in identifying these problems. When estimating the regression parameter of interest, we found that more extreme p-values were associated with poorer imputation model performance, although the results highlighted that traditional thresholds for classical p-values do not apply in this context. A shortcoming of the PPC method was its reduced ability to detect misspecified models with increasing amounts of missing data. Despite the limitations of posterior predictive p-values, they appear to have a valuable place in the imputer's toolkit. In addition to automated checking using p-values, we recommend imputers perform graphical checks and examine other summaries of the test quantity distribution.
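The general PPC recipe described in the abstract — simulate replicated data from the posterior predictive distribution, compute a test quantity on the observed data and on each replicate, and report the proportion of replicates at least as extreme — can be sketched in a minimal, self-contained example. This is an illustrative sketch, not the authors' simulation study: it fits a (deliberately misspecified) normal model to skewed data, draws posterior parameters under the standard noninformative prior, and uses sample skewness as the test quantity. The model, test quantity, and all names here are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "observed" data: skewed, so a normal model is misspecified.
y_obs = rng.exponential(scale=1.0, size=200)
n = y_obs.size
ybar, s2 = y_obs.mean(), y_obs.var(ddof=1)


def skewness(x):
    """Sample skewness: the test (discrepancy) quantity T(y)."""
    z = (x - x.mean()) / x.std(ddof=0)
    return np.mean(z ** 3)


t_obs = skewness(y_obs)

# Posterior draws for (mu, sigma^2) under a normal model with the
# standard noninformative prior p(mu, sigma^2) proportional to 1/sigma^2:
#   sigma^2 | y  ~  (n-1) s^2 / chi^2_{n-1}
#   mu | sigma^2, y  ~  N(ybar, sigma^2 / n)
n_rep = 2000
sigma2 = (n - 1) * s2 / rng.chisquare(n - 1, size=n_rep)
mu = rng.normal(ybar, np.sqrt(sigma2 / n))

# Simulate a replicated dataset from each posterior draw and compute
# the test quantity on each replicate.
t_rep = np.array([
    skewness(rng.normal(m, np.sqrt(v), size=n))
    for m, v in zip(mu, sigma2)
])

# Posterior predictive p-value: Pr(T(y_rep) >= T(y_obs) | y).
# A value near 0 or 1 flags a misfit between model and data.
ppp = np.mean(t_rep >= t_obs)
print(f"observed skewness = {t_obs:.2f}, posterior predictive p-value = {ppp:.3f}")
```

Because the replicated datasets are drawn from a symmetric normal model, their skewness hovers near zero while the observed skewness is large, so the p-value here lands near 0, i.e., the check correctly flags the misspecification. In line with the abstract's recommendation, the vector `t_rep` can also be examined graphically (e.g., a histogram with `t_obs` marked) rather than reduced to the p-value alone.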