Just‐Identified Versus Overidentified Two‐Level Hierarchical Linear Models with Missing Data
Author(s) - Shin Yongyun, Raudenbush Stephen W.
Publication year - 2007
Publication title - Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/j.1541-0420.2007.00818.x
Subject(s) - missing data, imputation (statistics), covariate, joint probability distribution, statistics, marginal model, hierarchical database model, linear regression, regression analysis, multilevel model, mathematics, linear model, transformation (genetics), computer science, data mining, biochemistry, chemistry, gene
Summary The development of model‐based methods for incomplete data has been a seminal contribution to statistical practice. Under the assumption of ignorable missingness, one estimates the joint distribution of the complete data for θ ∈ Θ from the incomplete or observed data y_obs. Many interesting models involve one‐to‐one transformations of θ. For example, with y_i ∼ N(μ, Σ) for i = 1, …, n and θ = (μ, Σ), an ordinary least squares (OLS) regression model is a one‐to‐one transformation of θ. Inferences based on such a transformation are equivalent to inferences based on OLS using data multiply imputed from f(y_mis | y_obs, θ) for missing y_mis. Thus, identification of θ from y_obs is equivalent to identification of the regression model. In this article, we consider a model for two‐level data with continuous outcomes where the observations within each cluster are dependent. The parameters of the hierarchical linear model (HLM) of interest, however, lie in a subspace of Θ in general. This identification of the joint distribution overidentifies the HLM. We show how to characterize the joint distribution so that its parameters are a one‐to‐one transformation of the parameters of the HLM. This leads to efficient estimation of the HLM from incomplete data using either the transformation method or the method of multiple imputation. The approach allows outcomes and covariates to be missing at either of the two levels, and the HLM of interest can involve the regression of any subset of variables on a disjoint subset of variables conceived as covariates.
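The single‐level example in the summary, in which an OLS regression is a one‐to‐one transformation of θ = (μ, Σ), can be illustrated with a minimal sketch. It is not the authors' code and does not cover the two‐level HLM or missing data: it assumes complete simulated data, treats the first variable as the outcome, and uses a hypothetical helper name, regression_from_moments. Under those assumptions it maps the maximum likelihood estimates of μ and Σ to an intercept, slopes, and residual variance, and compares them with a direct least squares fit.

```python
import numpy as np


def regression_from_moments(mu, sigma, y_idx=0):
    """Map theta = (mu, sigma) of a joint normal to regression parameters.

    Hypothetical helper for illustration only: regresses variable y_idx
    on all remaining variables.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    x_idx = [j for j in range(mu.size) if j != y_idx]

    s_xx = sigma[np.ix_(x_idx, x_idx)]   # covariance among covariates
    s_xy = sigma[x_idx, y_idx]           # covariance of covariates with outcome

    slopes = np.linalg.solve(s_xx, s_xy)
    intercept = mu[y_idx] - slopes @ mu[x_idx]
    resid_var = sigma[y_idx, y_idx] - s_xy @ slopes
    return intercept, slopes, resid_var


# Simulated complete data (assumed values, chosen only for the demonstration).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 2))
y = 1.0 + x @ np.array([2.0, -0.5]) + rng.normal(scale=0.3, size=n)
data = np.column_stack([y, x])

# Maximum likelihood estimates of mu and Sigma (divisor n, hence bias=True).
mu_hat = data.mean(axis=0)
sigma_hat = np.cov(data, rowvar=False, bias=True)

intercept, slopes, resid_var = regression_from_moments(mu_hat, sigma_hat)

# Direct OLS fit on the same data for comparison.
design = np.column_stack([np.ones(n), x])
ols_coef, *_ = np.linalg.lstsq(design, y, rcond=None)
ols_resid_var = np.mean((y - design @ ols_coef) ** 2)

print("from moments:", intercept, slopes, resid_var)
print("direct OLS:  ", ols_coef[0], ols_coef[1:], ols_resid_var)
```

With complete data the two routes agree up to floating‐point error, which is the sense in which the regression parameters are a reparameterization of (μ, Σ). With incomplete data, μ and Σ would first have to be estimated from y_obs (for example by an EM algorithm, not shown here) before the same mapping is applied.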