Open Access
Two penalized mixed–integer nonlinear programming approaches to tackle multicollinearity and outliers effects in linear regression models
Author(s) - Mahdi Roozbeh, Saman Babaie–Kafaki, Zohre Aminifard
Publication year - 2020
Publication title - Journal of Industrial and Management Optimization
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.325
H-Index - 32
eISSN - 1553-166X
pISSN - 1547-5816
DOI - 10.3934/jimo.2020128
Subject(s) - multicollinearity, variance inflation factor, outlier, overfitting, estimator, robust regression, statistics, mathematics, computer science, linear regression, econometrics, mathematical optimization, machine learning, artificial neural network
In classical regression analysis, ordinary least–squares estimation is the best strategy when the essential assumptions are met, namely normality and independence of the error terms and negligible multicollinearity among the covariates. If any of these assumptions is violated, however, the results may be misleading. In particular, outliers violate the assumption of normally distributed residuals in least–squares regression. In this situation, robust estimators are widely used because they are insensitive to outlying data points. Multicollinearity is another common problem in multiple regression models, with adverse effects on the least–squares estimators. It is therefore important to use estimation methods designed to tackle these problems. Robust regression is among the most popular approaches for analyzing data contaminated with outliers. Along these lines, we propose two mixed–integer nonlinear optimization models whose solutions can serve as appropriate estimators when outliers and multicollinearity appear simultaneously in the data set. The models can be solved effectively by metaheuristic algorithms and are built on penalization schemes capable of down–weighting or ignoring unusual data and multicollinearity effects. We establish that our models are computationally advantageous in terms of flop count. We also deal with a robust ridge methodology. Finally, three real data sets are analyzed to examine the performance of the proposed methods.
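
The two problems named in the abstract can be illustrated with a small sketch. This is not the authors' mixed–integer nonlinear models, which the abstract does not spell out. It substitutes a simpler, well-known stand-in: a ridge-penalized Huber fit computed by iteratively reweighted least squares, with variance inflation factors as the multicollinearity diagnostic. The function names, the synthetic data, and the settings `lam=1.0` and `delta=1.345` are all assumptions made for illustration, and the data are assumed to be centered so that no intercept is needed.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF of each column, read off the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def huber_ridge(X, y, lam=1.0, delta=1.345, n_iter=100, tol=1e-8):
    """Ridge-penalized Huber regression via iteratively reweighted least squares.

    Illustrative stand-in only (hypothetical helper, not the paper's models).
    Assumes X and y are centered, so no intercept is fitted.
    Returns the coefficient vector and the final observation weights.
    """
    n, p = X.shape
    penalty = lam * np.eye(p)
    # Start from the plain ridge estimate.
    beta = np.linalg.solve(X.T @ X + penalty, X.T @ y)
    w = np.ones(n)
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust residual scale: normalized median absolute deviation.
        scale = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, delta/u for large ones.
        w = np.minimum(1.0, delta / np.maximum(u, 1e-12))
        XtW = X.T * w  # scales column i of X.T by w[i], i.e. X^T W
        new_beta = np.linalg.solve(XtW @ X + penalty, XtW @ y)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta, w


# Synthetic data (assumed): two nearly collinear covariates plus one independent one.
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)   # almost a copy of x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
beta_true = np.array([1.0, 1.0, 2.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)
y[:5] += 20.0                          # contaminate five responses with gross outliers

vifs = variance_inflation_factors(X)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_hat, weights = huber_ridge(X, y, lam=1.0)

print("VIFs:", vifs)
print("OLS estimate:", beta_ols)
print("Huber-ridge estimate:", beta_hat)
print("Weights of contaminated points:", weights[:5])
```

In this sketch the ridge term stabilizes the nearly collinear pair, while the Huber weights shrink the influence of the contaminated observations rather than discarding them outright. Models that ignore unusual data entirely, as the abstract mentions, would instead need binary selection variables, which is where a mixed–integer formulation comes in.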
