Semiparametric pseudoscore for regression with multidimensional but incompletely observed regressor
Author(s) - Hu Zonghui, Qin Jing, Follmann Dean
Publication year - 2017
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.7253
Subject(s) - missing data, estimator, mathematics, curse of dimensionality, semiparametric regression, statistics, conditional expectation, econometrics
We study the regression f_β(Y | X, Z), where Y is the response, Z ∈ R^d is a vector of fully observed regressors, and X is the regressor with incomplete observation. To handle missing data, maximum likelihood estimation via expectation‐maximisation (EM) is the most efficient but is sensitive to the specification of the distribution of X. Under a missing at random assumption, we propose an EM‐type estimation via a semiparametric pseudoscore. As in EM, we derive the conditional expectation of the score function given Y and Z, or the ‘mean score’, over the incompletely observed units under a postulated distribution of X. Instead of using the mean score directly in the estimating equation, we use it as a working index to construct the semiparametric pseudoscore via nonparametric regression. Introducing the semiparametric pseudoscore into the EM framework reduces sensitivity to the specified distribution of X. It also avoids the curse of dimensionality when Z is multidimensional. The resulting regression estimator is more than doubly robust: it is consistent if either the pattern of missingness in X is correctly specified or the working index is appropriately, but not necessarily correctly, specified. It attains optimal efficiency when both conditions are satisfied. Numerical performance is explored by Monte Carlo simulations and a study on treating hepatitis C patients with HIV coinfection. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
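
The step of regressing the score on a working index can be illustrated with a minimal sketch. This is not the authors' estimator. It assumes a Gaussian-kernel Nadaraya-Watson smoother fitted on the complete cases, a user-chosen bandwidth, and a simple estimating function in which incomplete units contribute the smoothed score. It omits any weighting by a model for the missingness pattern. The names `nw_pseudoscore` and `pseudoscore_equation` are hypothetical.

```python
import numpy as np

def nw_pseudoscore(index, score, observed, bandwidth):
    """Nadaraya-Watson estimate of E[score | index] for every unit.

    index    : (n,) or (n, q) working index, available for all units
    score    : (n, p) score values; rows with X missing may hold NaN
    observed : (n,) boolean, True where X is observed
    bandwidth: positive kernel bandwidth (an assumed tuning parameter)
    """
    index = np.asarray(index, dtype=float)
    if index.ndim == 1:
        index = index[:, None]
    score = np.asarray(score, dtype=float)
    observed = np.asarray(observed, dtype=bool)

    idx_obs = index[observed]                       # (m, q) complete cases
    diff = (index[:, None, :] - idx_obs[None, :, :]) / bandwidth
    w = np.exp(-0.5 * (diff ** 2).sum(axis=2))      # (n, m) kernel weights
    # Weighted average of complete-case scores; assumes every unit has
    # at least one complete case close enough for a nonzero weight.
    return (w @ score[observed]) / w.sum(axis=1, keepdims=True)

def pseudoscore_equation(score, pseudo, observed):
    """Sum the true score over complete units and the pseudoscore over
    incomplete units (an assumed, simplified estimating function)."""
    observed = np.asarray(observed, dtype=bool)
    return np.where(observed[:, None], score, pseudo).sum(axis=0)
```

In practice one would evaluate both functions at a candidate β and pass the result to a root finder. Because the smoother acts on the index and not on Z itself, the dimension of the nonparametric regression does not grow with d.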