Empirical‐likelihood‐based inference in missing response problems and its application in observational studies
Author(s) - Qin Jing, Zhang Biao
Publication year - 2007
Publication title - Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.523
H-Index - 137
eISSN - 1467-9868
pISSN - 1369-7412
DOI - 10.1111/j.1467-9868.2007.00579.x
Subject(s) - missing data, estimator, empirical likelihood, mathematics, covariate, statistics, robustness (evolution), likelihood function, estimating equations, regression analysis, inference, inverse probability weighting, curse of dimensionality, econometrics, computer science, estimation theory, artificial intelligence, biochemistry, chemistry, gene
Summary. The problem of missing response data is ubiquitous in medical and social science studies. In the case of responses that are missing at random (depending on some covariate information), analyses focused only on the complete data may lead to biased results. Various debiasing methods have been extensively studied in the literature, particularly the weighting method that was motivated by Horvitz and Thompson's estimators. To improve efficiency, Robins, Rotnitzky and Zhao proposed augmented estimating equations based on corrected complete‐case analyses. A nice feature of the augmented method is its ‘double robustness’, i.e. the estimator that is derived from the augmented method is asymptotically unbiased if either the underlying missing data mechanism or the underlying regression function is correctly specified. Furthermore, the augmented estimator can achieve full efficiency if both the missing data mechanism and the regression function are correctly specified. In general, however, it is very difficult to specify the regression function correctly, especially when the dimension of covariates is high; this is the so‐called curse of dimensionality problem. The augmented estimator has much lower efficiency if the ‘working regression model’ is not close to the true regression model. In this paper, the empirical likelihood method is employed to seek a constrained empirical likelihood estimator of the mean response under the assumption that responses are missing at random. The empirical‐likelihood‐based estimators enjoy the double‐robustness property. Moreover, it is possible that the empirical‐likelihood‐based inference can produce asymptotically unbiased and efficient estimators even if the true regression function is not completely known. Simulation results indicate that the empirical‐likelihood‐based estimators are very robust to a misspecification of the propensity score and dominate other competitors in the sense of having smaller mean‐square errors.
Methods that are developed in this paper have a nice application in observational causal inferences. The propensity score is used to adjust for differences in pretreatment variables in the estimation of average treatment effects.
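The weighting and augmented estimators that the summary refers to can be illustrated with a small simulation. The sketch below is not the empirical-likelihood estimator proposed in the paper; it only shows the Horvitz and Thompson type weighted mean and the augmented (doubly robust) mean under responses missing at random. The data-generating model, the function names `ipw_mean` and `aipw_mean`, and the use of a known propensity score and a known regression function are all assumptions made for illustration; in practice both would be estimated.

```python
import numpy as np

def ipw_mean(y, r, pi):
    """Inverse-probability-weighted estimate of E[Y]; y is read only where r == 1."""
    y_obs = np.where(r == 1, y, 0.0)  # missing responses contribute zero
    return float(np.mean(r * y_obs / pi))

def aipw_mean(y, r, pi, m):
    """Augmented estimate: weighted mean minus an augmentation term built from m(x)."""
    y_obs = np.where(r == 1, y, 0.0)
    return float(np.mean(r * y_obs / pi - (r - pi) / pi * m))

# Hypothetical simulation (assumed model): true mean response is 1.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)

# Missing at random: the observation probability depends on x only.
pi_true = 1.0 / (1.0 + np.exp(-(0.5 + x)))
r = (rng.uniform(size=n) < pi_true).astype(float)
y = np.where(r == 1, y_full, np.nan)  # unobserved responses are NaN

m_true = 1.0 + 2.0 * x            # correctly specified regression function
pi_wrong = np.full(n, r.mean())   # misspecified propensity: ignores x

cc = float(np.nanmean(y))                          # complete-case mean
ipw = ipw_mean(y, r, pi_true)                      # correct propensity
aipw = aipw_mean(y, r, pi_true, m_true)            # both models correct
aipw_wrong_pi = aipw_mean(y, r, pi_wrong, m_true)  # wrong propensity, correct regression

print(f"complete-case: {cc:.3f}")
print(f"IPW:           {ipw:.3f}")
print(f"AIPW:          {aipw:.3f}")
print(f"AIPW, wrong propensity: {aipw_wrong_pi:.3f}")
```

In this setup the complete-case mean overshoots the true mean because units with larger covariate values are more likely to be observed, while the weighted and augmented estimates stay close to 1. The last line illustrates double robustness: the augmented estimate remains close to 1 with a misspecified propensity score as long as the regression function is correct.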