Comparison of missing value imputation methods for crop yield data
Author(s) - Lokupitiya Ravindra S., Lokupitiya Erandathie, Paustian Keith
Publication year - 2006
Publication title - Environmetrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.68
H-Index - 58
eISSN - 1099-095X
pISSN - 1180-4009
DOI - 10.1002/env.773
Subject(s) - missing data, imputation (statistics), statistics, smoothing, kriging, mathematics, regression, econometrics, computer science
Most ecological data sets contain missing values, which can cause problems in the analysis and limit the utility of resulting inference. However, ecological data also tend to be spatially correlated, which can aid in estimating and imputing missing values. We compared four existing methods of estimating missing values: regression, kernel smoothing, universal kriging, and multiple imputation. Data on crop yields from the National Agricultural Statistics Service (NASS) and the Census of Agriculture (Ag Census) were the basis for our analysis. Our goal was to find the best method to impute missing values in the NASS datasets. For this comparison, we selected the NASS data for barley crop yield in 1997 as our reference dataset. We found in this case that multiple imputation and regression were superior to methods based on spatial correlation. Universal kriging was found to be the third best method. Kernel smoothing seemed to perform very poorly. Copyright © 2005 John Wiley & Sons, Ltd.
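The abstract gives no implementation details, so the following is only a minimal sketch of how two of the four method families it names (regression and kernel smoothing) might be compared by masking known values and scoring the imputations. It is not the authors' procedure. The function names, the Gaussian kernel, the bandwidth, the use of a census yield as the regression covariate, and the synthetic data are all assumptions made for illustration. Universal kriging and multiple imputation are omitted.

```python
import math
import random


def regression_impute(x_obs, y_obs, x_missing):
    """Fit y = a + b*x by ordinary least squares, then predict at x_missing."""
    n = len(x_obs)
    mean_x = sum(x_obs) / n
    mean_y = sum(y_obs) / n
    sxx = sum((x - mean_x) ** 2 for x in x_obs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_obs, y_obs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * x for x in x_missing]


def kernel_impute(coords_obs, y_obs, coords_missing, bandwidth):
    """Nadaraya-Watson estimate: Gaussian-weighted mean of observed neighbours."""
    fallback = sum(y_obs) / len(y_obs)  # used only if every weight underflows
    estimates = []
    for px, py in coords_missing:
        weights = [
            math.exp(-((px - qx) ** 2 + (py - qy) ** 2) / (2 * bandwidth ** 2))
            for qx, qy in coords_obs
        ]
        total = sum(weights)
        if total == 0.0:
            estimates.append(fallback)
        else:
            estimates.append(sum(w * y for w, y in zip(weights, y_obs)) / total)
    return estimates


def rmse(predicted, truth):
    """Root mean squared error between imputed and held-out true values."""
    return math.sqrt(
        sum((p - t) ** 2 for p, t in zip(predicted, truth)) / len(truth)
    )


def demo(seed=0, n_units=200, missing_fraction=0.2):
    """Mask part of a synthetic yield series and score both imputers."""
    rng = random.Random(seed)
    # Hypothetical units: a location, a census yield, and a survey yield.
    coords = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n_units)]
    census = [50 + 3 * x + rng.gauss(0, 5) for x, _ in coords]
    survey = [5 + 0.9 * c + rng.gauss(0, 3) for c in census]

    indices = list(range(n_units))
    rng.shuffle(indices)
    n_missing = int(missing_fraction * n_units)
    missing, observed = indices[:n_missing], indices[n_missing:]

    truth = [survey[i] for i in missing]
    reg = regression_impute(
        [census[i] for i in observed],
        [survey[i] for i in observed],
        [census[i] for i in missing],
    )
    ker = kernel_impute(
        [coords[i] for i in observed],
        [survey[i] for i in observed],
        [coords[i] for i in missing],
        bandwidth=1.0,
    )
    return {"regression": rmse(reg, truth), "kernel": rmse(ker, truth)}


if __name__ == "__main__":
    for method, error in demo().items():
        print(f"{method}: RMSE = {error:.2f}")
```

Because the synthetic survey yield is generated directly from the census covariate, the ranking this script prints says nothing about the paper's findings. It only illustrates the hold-out comparison pattern: hide known values, impute them with each method, and score the result against the hidden truth.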