Influence of outliers on some multiple imputation methods
Author(s) - Claudio Quintano, Rosalia Castellano, Antonella Rocca
Publication year - 2010
Publication title - Metodološki zvezki
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.127
H-Index - 7
eISSN - 1854-0031
pISSN - 1854-0023
DOI - 10.51936/tuki4538
Subject(s) - imputation (statistics) , missing data , outlier , markov chain monte carlo , computer science , data mining , statistics , data quality , survey sampling , markov chain , population , monte carlo method , econometrics , mathematics , artificial intelligence , engineering , metric (unit) , operations management , demography , sociology
In the field of data quality, imputation is the most widely used method for handling missing data. The performance of imputation techniques is influenced by various factors, such as the survey design characteristics, especially when the data represent only a sample of the population. In this paper, we compare the results of different multiple imputation methods in terms of final estimates when outliers occur in a dataset. To evaluate the influence of outliers on the performance of these methods, each procedure is applied both before and after the outliers have been identified and removed. For this purpose, missing data were simulated on data from the ISTAT annual sample survey on Small and Medium Enterprises. A missing at random (MAR) mechanism is assumed for the missing data. The methods compared are multiple imputation based on Markov Chain Monte Carlo (MCMC), on the propensity score, and on mixture models. The results highlight the strong influence of data characteristics on the final estimates.
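
The before/after comparison described in the abstract can be illustrated with a small sketch. This is not the authors' procedure: the data are synthetic rather than ISTAT data, the imputation step is a simplified stochastic regression imputation (it ignores parameter uncertainty and stands in for the MCMC, propensity score and mixture-model methods), and the outlier rule (median plus or minus 3 scaled MADs) is an assumption made only for illustration. All function names and parameters below are hypothetical.

```python
import numpy as np

def make_mar_missing(x, y, rng, base=-1.0, slope=1.0):
    """Set y to NaN with a probability that depends only on the observed x (MAR)."""
    z = (x - x.mean()) / x.std()
    p_miss = 1.0 / (1.0 + np.exp(-(base + slope * z)))
    y_miss = y.copy()
    y_miss[rng.random(len(y)) < p_miss] = np.nan
    return y_miss

def impute_once(x, y_miss, rng):
    """Fit y ~ x on complete cases, then fill gaps with prediction plus noise."""
    obs = ~np.isnan(y_miss)
    X = np.column_stack([np.ones(obs.sum()), x[obs]])
    beta, *_ = np.linalg.lstsq(X, y_miss[obs], rcond=None)
    sigma = (y_miss[obs] - X @ beta).std(ddof=2)
    y_imp = y_miss.copy()
    n_mis = int((~obs).sum())
    y_imp[~obs] = beta[0] + beta[1] * x[~obs] + rng.normal(0.0, sigma, n_mis)
    return y_imp

def pooled_mean(x, y_miss, rng, m=5):
    """Pool the mean of y over m imputed datasets with Rubin's rules."""
    ests, within = [], []
    for _ in range(m):
        y_imp = impute_once(x, y_miss, rng)
        ests.append(y_imp.mean())
        within.append(y_imp.var(ddof=1) / len(y_imp))
    q_bar = float(np.mean(ests))        # pooled point estimate
    w = float(np.mean(within))          # within-imputation variance
    b = float(np.var(ests, ddof=1))     # between-imputation variance
    return q_bar, w + (1.0 + 1.0 / m) * b

def drop_outliers(x, y_miss, k=3.0):
    """Assumed rule: drop observed y farther than k scaled MADs from the median."""
    med = np.nanmedian(y_miss)
    mad = 1.4826 * np.nanmedian(np.abs(y_miss - med))
    keep = np.isnan(y_miss) | (np.abs(y_miss - med) <= k * mad)
    return x[keep], y_miss[keep]

rng = np.random.default_rng(0)
n = 500
x = rng.normal(50.0, 10.0, n)
y = 2.0 * x + rng.normal(0.0, 5.0, n)   # mean of y is 100 without contamination
y[:20] *= 10.0                          # contaminate 20 units with gross outliers
y_miss = make_mar_missing(x, y, rng)

est_raw, var_raw = pooled_mean(x, y_miss, rng)
x_clean, y_clean = drop_outliers(x, y_miss)
est_clean, var_clean = pooled_mean(x_clean, y_clean, rng)
print("with outliers:   ", est_raw, var_raw)
print("outliers removed:", est_clean, var_clean)
```

In this toy setting the pooled mean computed with the outliers left in is pulled well away from the uncontaminated mean, while the estimate after removal stays close to it, which is the kind of sensitivity the paper sets out to measure on real survey data.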