Premium
Rounding Strategies for Multiply Imputed Binary Data
Author(s) -
Demirtas Hakan
Publication year - 2009
Publication title -
biometrical journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.108
H-Index - 63
eISSN - 1521-4036
pISSN - 0323-3847
DOI - 10.1002/bimj.200900018
Subject(s) - rounding , imputation (statistics) , missing data , computer science , multivariate normal distribution , binary data , data mining , statistics , marginal distribution , binary number , econometrics , multivariate statistics , mathematics , random variable , arithmetic , operating system
Multiple imputation (MI) has emerged in the last two decades as a frequently used approach in dealing with incomplete data. Gaussian and log‐linear imputation models are fairly straightforward to implement for continuous and discrete data, respectively. However, in missing data settings that include a mix of continuous and discrete variables, the lack of flexible models for the joint distribution of different types of variables can make the specification of the imputation model a daunting task. The widespread availability of software packages that are capable of carrying out MI under the assumption of joint multivariate normality allows applied researchers to address this complication pragmatically by treating the discrete variables as continuous for imputation purposes and subsequently rounding the imputed values to the nearest observed category. In this article, we compare several rounding rules for binary variables based on simulated longitudinal data sets that have been used to illustrate other missing‐data techniques. Using a combination of conditional and marginal data generation mechanisms and imputation models, we study the statistical properties of multiple‐imputation‐based estimates for various population quantities under different rounding rules from bias and coverage standpoints. We conclude that a good rule should be driven by borrowing information from other variables in the system rather than relying on the marginal characteristics and should be relatively insensitive to imputation model specifications that may potentially be incompatible with the observed data. We also urge researchers to consider the applied context and specific nature of the problem, to avoid uncritical and possibly inappropriate use of rounding in imputation models.