Dropping Highly Collinear Variables from a Model: Why it Typically is Not a Good Idea
Author(s) - Robert M. O'Brien
Publication year - 2017
Publication title - Social Science Quarterly
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.482
H-Index - 90
eISSN - 1540-6237
pISSN - 0038-4941
DOI - 10.1111/ssqu.12273
Subject(s) - multicollinearity, variables, variable (mathematics), regression analysis, econometrics, statistics, Venn diagram, regression diagnostic, mathematics, polynomial regression, mathematical analysis, mathematics education
Objective. To change the common practice of eliminating independent variables from models because they produce multicollinearity in an independent variable of special interest.
Methods. I supplement my presentation, which is based on the purposes of regression analysis, by using Venn diagrams, simple formulas, and two small simulations.
Results. Independent variables that, when removed from a model, substantially change the statistics associated with the independent variable(s) of most interest are variables that should typically be kept in the model. Multicollinearity is not a sufficient reason to drop variables from a model.
Conclusion. I argue against the routine dropping of variables that cause multicollinearity in an independent variable of interest from regression models. A more important criterion to consider when contemplating dropping a variable from a model is “model influence.”
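The abstract's central claim can be illustrated with a small simulation. The sketch below is not one of the article's two simulations. The sample size, the correlation of about 0.9 between the predictors, the true coefficients of 1.0, and the helper name `ols` are all assumptions chosen for illustration. It fits a model with two collinear predictors, then refits after dropping one, and compares the coefficient and standard error of the variable of interest (`x1`).

```python
import numpy as np

def ols(X, y):
    """Hypothetical helper: OLS coefficients and standard errors.
    X must already contain an intercept column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof              # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # coefficient covariance matrix
    return beta, np.sqrt(np.diag(cov))

# Assumed data-generating process (illustrative values only).
rng = np.random.default_rng(0)
n = 5000
x2 = rng.normal(size=n)
x1 = 0.9 * x2 + np.sqrt(1 - 0.9**2) * rng.normal(size=n)  # corr(x1, x2) near 0.9
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)              # both true slopes are 1.0

ones = np.ones(n)
b_full, se_full = ols(np.column_stack([ones, x1, x2]), y)  # keep x2
b_drop, se_drop = ols(np.column_stack([ones, x1]), y)      # drop x2

# With exactly two predictors, the VIF of x1 is 1 / (1 - r^2).
r2 = np.corrcoef(x1, x2)[0, 1] ** 2
vif_x1 = 1.0 / (1.0 - r2)

print(f"VIF for x1:             {vif_x1:.2f}")
print(f"x1 slope, x2 kept:      {b_full[1]:.3f} (SE {se_full[1]:.3f})")
print(f"x1 slope, x2 dropped:   {b_drop[1]:.3f} (SE {se_drop[1]:.3f})")
```

Under these assumptions, dropping `x2` shrinks the standard error of the `x1` slope but moves the estimate well away from its true value of 1.0, because `x1` absorbs part of the effect of `x2`. A variable whose removal changes the estimate of interest this much is one with high model influence in the abstract's sense, which is the reason to keep it despite the inflated variance.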