Open Access
Ridge regression and its applications in genetic studies
Author(s) - Mohammad Arashi, Mahdi Roozbeh, Nor Aishah Hamzah, Mauro Gasparini
Publication year - 2021
Publication title - PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0245376
Subject(s) - multicollinearity, estimator, ridge, outlier, statistics, regression analysis, regression, linear regression, mathematics, computer science, biology, paleontology
Advances in technology have made the analysis of large-scale gene expression data feasible, and such analysis has become very popular in the era of machine learning. This paper develops an improved ridge approach for genome regression modeling. When the data set exhibits multicollinearity and contains outliers, we consider a robust ridge estimator, namely the rank ridge regression estimator, for parameter estimation and prediction. The efficiency of the rank ridge regression estimator, however, depends heavily on the ridge parameter, and in general it is difficult to give a satisfactory answer on how to select it. Because generalized cross validation (GCV) has good properties and is simple, we use it to choose the optimum value of the ridge parameter. The GCV function balances the precision of the estimators against the bias introduced by ridge estimation. It behaves like an improved estimator of risk and can be used in high-dimensional problems, where the number of explanatory variables exceeds the sample size. Finally, some numerical illustrations are given to support our findings.
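
To make the role of GCV concrete, here is a minimal sketch of choosing a ridge parameter by minimizing the GCV score. It uses the ordinary least-squares ridge estimator, not the rank ridge regression estimator developed in the paper, and it assumes the data are already centered (no intercept). The function name `ridge_gcv`, the candidate grid, and the simulated data are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def ridge_gcv(X, y, lambdas):
    """Pick a ridge parameter by GCV for ordinary (non-robust) ridge regression.

    Hypothetical helper for illustration. Returns (best_lambda, beta, scores).
    """
    n = X.shape[0]
    # Thin SVD handles both n > p and p > n.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y
    scores = []
    for lam in lambdas:
        shrink = s**2 / (s**2 + lam)       # eigenvalues of the hat matrix H
        fitted = U @ (shrink * Uty)        # H @ y
        rss = np.sum((y - fitted) ** 2)    # residual sum of squares
        edf = np.sum(shrink)               # trace(H), effective degrees of freedom
        # GCV(lam) = (RSS / n) / (1 - trace(H) / n)^2
        scores.append((rss / n) / (1.0 - edf / n) ** 2)
    scores = np.array(scores)
    best = float(lambdas[int(np.argmin(scores))])
    # Ridge coefficients at the selected parameter: V diag(s / (s^2 + lam)) U^T y
    beta = Vt.T @ ((s / (s**2 + best)) * Uty)
    return best, beta, scores


# Simulated high-dimensional example (p > n) with two nearly collinear columns.
rng = np.random.default_rng(0)
n, p = 30, 50
X = rng.standard_normal((n, p))
X[:, 1] = X[:, 0] + 0.01 * rng.standard_normal(n)
true_beta = np.zeros(p)
true_beta[:3] = [2.0, -1.0, 0.5]
y = X @ true_beta + 0.1 * rng.standard_normal(n)

lambdas = np.logspace(-2, 3, 30)
best_lambda, beta_hat, gcv_scores = ridge_gcv(X, y, lambdas)
print("selected ridge parameter:", best_lambda)
```

The numerator of the score rewards a close fit, while the denominator penalizes a large effective number of parameters, which is one way to read the balance between precision and bias described above. A robust variant would replace the least-squares fit with a rank-based one.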
