Covariate selection for multilevel models with missing data
Author(s) - Marino Miguel, Buxton Orfeu M., Li Yi
Publication year - 2017
Publication title - Stat
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.61
H-Index - 18
ISSN - 2049-1573
DOI - 10.1002/sta4.133
Subject(s) - covariate, missing data, imputation (statistics), lasso (programming language), computer science, feature selection, multilevel model, statistics, regression analysis, regression, data mining, econometrics, mathematics, machine learning, world wide web
Missing covariate data hamper variable selection in multilevel regression settings. Current variable selection techniques for multiply‐imputed data commonly address missingness in the predictors through list‐wise deletion and stepwise‐selection methods that are problematic. Moreover, most variable selection methods are developed for independent linear regression models and do not accommodate multilevel mixed effects regression models with incomplete covariate data. We develop a novel methodology that is able to perform covariate selection across multiply‐imputed data for multilevel random effects models when missing data are present. Specifically, we propose to stack the multiply‐imputed data sets from a multiple imputation procedure and to apply a group variable selection procedure through group lasso regularization to assess the overall impact of each predictor on the outcome across the imputed data sets. Simulations confirm the advantageous performance of the proposed method compared with the competing methods. We applied the method to reanalyse the Healthy Directions–Small Business cancer prevention study, which evaluated a behavioural intervention programme targeting multiple risk‐related behaviours in a working‐class, multi‐ethnic population. Copyright © 2017 John Wiley & Sons, Ltd.
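The stacking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it simplifies heavily: it fits an ordinary linear model with no random effects (so it is not a multilevel model), it uses a crude random-draw imputation as a stand-in for a proper multiple imputation procedure, and it solves the group lasso by plain proximal gradient descent. The function names (`group_soft_threshold`, `stacked_group_lasso`), the simulated data, and the penalty value `lam` are all hypothetical. Each predictor gets one coefficient per imputed data set, and the coefficients for the same predictor form one group, so a predictor is kept or dropped across all imputed data sets at once.

```python
import numpy as np

def group_soft_threshold(B, t):
    """Shrink each row of B (one predictor across imputations) toward zero as a group."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return B * scale

def stacked_group_lasso(X_list, y_list, lam, n_iter=500):
    """Proximal gradient for
       (1 / (2 n M)) * sum_m ||y_m - X_m b_m||^2 + lam * sum_j ||B[j, :]||_2,
    where column m of B holds the coefficients for imputed data set m."""
    M = len(X_list)
    n, p = X_list[0].shape
    B = np.zeros((p, M))
    # Step size from the largest Lipschitz constant over the M blocks.
    lipschitz = max(np.linalg.norm(X, 2) ** 2 for X in X_list) / (n * M)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        grad = np.column_stack([
            -X.T @ (y - X @ B[:, m]) / (n * M)
            for m, (X, y) in enumerate(zip(X_list, y_list))
        ])
        B = group_soft_threshold(B - step * grad, step * lam)
    return B

# Simulated example (all values are assumptions made for illustration).
rng = np.random.default_rng(0)
n, p, M = 200, 6, 5
X_full = rng.normal(size=(n, p))
beta_true = np.array([3.0, 2.0, 0.0, 0.0, 0.0, 0.0])
y = X_full @ beta_true + rng.normal(size=n)

# Make about 20% of the first covariate missing.
missing = rng.random(n) < 0.2
observed = X_full[~missing, 0]

# Crude stand-in for multiple imputation: draw missing values from a normal
# distribution matched to the observed mean and standard deviation.
X_list, y_list = [], []
for _ in range(M):
    X_imp = X_full.copy()
    X_imp[missing, 0] = rng.normal(observed.mean(), observed.std(), size=missing.sum())
    X_list.append(X_imp)
    y_list.append(y)

B = stacked_group_lasso(X_list, y_list, lam=0.1)
selected = np.linalg.norm(B, axis=1) > 0  # one in/out decision per predictor
print("selected predictors:", np.flatnonzero(selected))
```

Because the penalty acts on whole rows of `B`, a predictor cannot be selected in one imputed data set and dropped in another, which is the property the abstract relies on when it speaks of assessing each predictor's overall impact across the imputed data sets. Extending the sketch to the multilevel setting would require adding random effects to the loss, which is left out here.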
