MEBoost: Variable selection in the presence of measurement error
Author(s) - Brown Ben, Weaver Timothy, Wolfson Julian
Publication year - 2019
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.8130
Subject(s) - covariate , lasso (programming language) , statistics , feature selection , computer science , consistency (knowledge bases) , observational error , selection (genetic algorithm) , regression , variable (mathematics) , data set , set (abstract data type) , mathematics , algorithm , artificial intelligence , mathematical analysis , world wide web , programming language
We present a novel method for variable selection in regression models when covariates are measured with error. The iterative algorithm we propose, Measurement Error Boosting (MEBoost), follows a path defined by estimating equations that correct for covariate measurement error. We illustrate the use of MEBoost in practice by analyzing data from the Box Lunch Study, a clinical trial in nutrition in which several variables are based on self‐report and, hence, measured with error. There, we perform model selection on a large data set to identify variables related to the number of times a subject binge ate in the last 28 days. We also evaluated our method in a simulation study, comparing its performance to the recently proposed Convex Conditioned Lasso and to the “naive” Lasso, which does not correct for measurement error. Increasing the degree of measurement error increased prediction error and decreased the probability of accurate covariate selection, but this loss of accuracy was less pronounced when using MEBoost. Through simulations, we also make a case for the consistency of the model selected.
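
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way a boosting path driven by a measurement-error-corrected estimating equation could look. It assumes a linear model, additive error W = X + U, and a known error covariance; none of these assumptions comes from the abstract. The function names (corrected_score, boost_path) and parameters (sigma_u, step, n_iter) are hypothetical and are not taken from the authors' software.

```python
import numpy as np

def corrected_score(W, y, beta, sigma_u):
    """Corrected least-squares score under the ASSUMED model W = X + U.

    The naive score W^T (y - W beta) / n is biased by -sigma_u @ beta in
    expectation, so adding sigma_u @ beta back removes that bias.
    sigma_u (the error covariance) is assumed known here.
    """
    n = W.shape[0]
    return W.T @ (y - W @ beta) / n + sigma_u @ beta

def boost_path(W, y, sigma_u, step=0.01, n_iter=500):
    """Illustrative coordinate-wise boosting along the corrected score.

    Each iteration moves the single coefficient whose corrected score
    component is largest in absolute value by a small fixed step.
    Returns an array of shape (n_iter + 1, p) holding the coefficient path.
    """
    p = W.shape[1]
    beta = np.zeros(p)
    path = [beta.copy()]
    for _ in range(n_iter):
        g = corrected_score(W, y, beta, sigma_u)
        j = int(np.argmax(np.abs(g)))      # coordinate with the steepest direction
        beta[j] += step * np.sign(g[j])    # small step in that direction
        path.append(beta.copy())
    return np.array(path)

# Simulated illustration (hypothetical data, not the Box Lunch Study)
rng = np.random.default_rng(0)
n, p = 5000, 3
beta_true = np.array([2.0, 0.0, 0.0])
X = rng.normal(size=(n, p))                  # true covariates (unobserved)
W = X + rng.normal(scale=0.5, size=(n, p))   # error-prone measurements
y = X @ beta_true + rng.normal(size=n)
sigma_u = 0.25 * np.eye(p)                   # assumed-known error covariance
path = boost_path(W, y, sigma_u, step=0.01, n_iter=300)
```

Variables whose coefficients remain at zero along the path are the ones left unselected. How many iterations to run is a tuning choice that the abstract does not specify.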
