Imputation: Methods, Simulation Experiments and Practical Examples
Author(s) - Eric Schulte Nordholt
Publication year - 1998
Publication title - International Statistical Review
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.051
H-Index - 54
eISSN - 1751-5823
pISSN - 0306-7734
DOI - 10.1111/j.1751-5823.1998.tb00412.x
Subject(s) - imputation (statistics) , missing data , computer science , statistics , econometrics , data mining , data quality , mathematics , engineering , operations management , metric (unit)
Summary
When conducting surveys, two kinds of nonresponse may cause incomplete data files: unit nonresponse (complete nonresponse) and item nonresponse (partial nonresponse). The selectivity of unit nonresponse is often corrected for. Various imputation techniques can be used for the values that are missing because of item nonresponse. Several of these techniques are discussed in this paper, one of which is hot deck imputation. The paper describes two simulation experiments with the hot deck method. In the first study, a Monte Carlo experiment, data are randomly generated and various percentages of missing values are then non-randomly 'added' to the data. The hot deck method is used to reconstruct the data, and its performance is evaluated for the means, standard deviations, and correlation coefficients and compared with the available case method. In the second study, a selection of the data of the Dutch Housing Demand Survey is perturbed by leaving out specific values of a variable. Hot deck imputations are again used to reconstruct the data, and the imputations are then compared with the true values. In both experiments the conclusion is that the hot deck method generally performs better than the available case method. The paper also addresses the questions of which variables should be imputed and how long the imputation process takes. Finally, the theory is illustrated by the imputation approaches of the Dutch Housing Demand Survey, the European Community Household Panel Survey (ECHP) and the new Dutch Structure of Earnings Survey (SES). These examples illustrate the levels of missing data that can be experienced in such surveys and the practical problems associated with choosing an appropriate imputation strategy for key items from each survey.
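The first experiment described in the summary (generate data, remove values non-randomly, impute with a hot deck, compare against the available case estimate) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three groups and their means, the group-dependent missingness rates, the use of a random hot deck within classes, and all function names. The paper's actual design may differ.

```python
import random
import statistics

# Hypothetical group means; the paper's data-generating model is not given here.
GROUP_MEANS = {"a": 10.0, "b": 20.0, "c": 30.0}


def make_data(n, rng):
    """Generate complete records whose value depends on an auxiliary group."""
    records = []
    for _ in range(n):
        group = rng.choice(sorted(GROUP_MEANS))
        value = rng.gauss(GROUP_MEANS[group], 2.0)
        records.append({"group": group, "value": value})
    return records


def add_missing(records, rng, rates):
    """Blank out values non-randomly: the missingness rate depends on the group."""
    out = []
    for r in records:
        missing = rng.random() < rates[r["group"]]
        out.append({"group": r["group"], "value": None if missing else r["value"]})
    return out


def hot_deck(records, rng):
    """Random hot deck within classes: copy a donor value from the same group."""
    donors = {}
    for r in records:
        if r["value"] is not None:
            donors.setdefault(r["group"], []).append(r["value"])
    all_donors = [v for values in donors.values() for v in values]
    completed = []
    for r in records:
        value = r["value"]
        if value is None:
            # Fall back to donors from any group if the class has none.
            pool = donors.get(r["group"]) or all_donors
            value = rng.choice(pool)
        completed.append({"group": r["group"], "value": value})
    return completed


def run_experiment(seed=0, n=3000):
    """Return the true, available case, and hot deck means for one replication."""
    rng = random.Random(seed)
    # Assumed rates: high values (group "c") are missing most often.
    rates = {"a": 0.05, "b": 0.20, "c": 0.60}
    complete = make_data(n, rng)
    incomplete = add_missing(complete, rng, rates)
    imputed = hot_deck(incomplete, rng)
    observed = [r["value"] for r in incomplete if r["value"] is not None]
    return {
        "true": statistics.mean(r["value"] for r in complete),
        "available_case": statistics.mean(observed),
        "hot_deck": statistics.mean(r["value"] for r in imputed),
    }


if __name__ == "__main__":
    for name, mean in run_experiment(seed=1).items():
        print(f"{name:>15}: {mean:.2f}")
```

Because missingness in this sketch is concentrated in the group with the highest values, the available case mean is pulled downward, while imputing within groups restores the group composition. A full Monte Carlo study would repeat `run_experiment` over many seeds and missingness percentages and would also compare standard deviations and correlation coefficients, as the summary describes.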
