Ridge Estimators in Logistic Regression
Author(s) - le Cessie S., van Houwelingen J. C.
Publication year - 1992
Publication title - Journal of the Royal Statistical Society: Series C (Applied Statistics)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.205
H-Index - 72
eISSN - 1467-9876
pISSN - 0035-9254
DOI - 10.2307/2347628
Subject(s) - ridge, logistic regression, statistics, estimator, geology, regression, mathematics, geography, econometrics, paleontology
Summary - In this paper it is shown how ridge estimators can be used in logistic regression to improve the parameter estimates and to diminish the error made by further predictions. Different ways to choose the unknown ridge parameter are discussed. The main attention focuses on ridge parameters obtained by cross‐validation. Three different ways to define the prediction error are considered: classification error, squared error and minus log‐likelihood. The use of ridge regression is illustrated by developing a prognostic index for the two‐year survival probability of patients with ovarian cancer as a function of their deoxyribonucleic acid (DNA) histogram. In this example, the number of covariates is large compared with the number of observations, and modelling without restrictions on the parameters leads to overfitting. Defining a restriction on the parameters, such that neighbouring intervals in the DNA histogram differ only slightly in their influence on the survival, yields ridge‐type parameter estimates with reasonable values which can be clinically interpreted. Furthermore, the model can predict new observations more accurately.
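The ideas in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a penalised log-likelihood of the form l(beta) - lam * beta' P beta with an unpenalised intercept, fitted by Newton-Raphson, and it uses simple versions of the three prediction errors evaluated by k-fold cross-validation. All function names (`fit_ridge_logistic`, `prediction_errors`, `cv_error`, `difference_penalty`) and the synthetic data are hypothetical, chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def difference_penalty(p):
    """P = D'D, so beta' P beta = sum_j (beta[j+1] - beta[j])**2.
    Penalises differences between neighbouring coefficients."""
    D = np.diff(np.eye(p), axis=0)  # (p-1, p) first-difference matrix
    return D.T @ D

def fit_ridge_logistic(X, y, lam, penalty=None, n_iter=50, tol=1e-8):
    """Maximise loglik(beta) - lam * beta' P beta by Newton-Raphson.
    P defaults to the identity (ordinary ridge); the intercept,
    stored in beta[0], is left unpenalised (an assumption of this sketch)."""
    n, p = X.shape
    Xd = np.hstack([np.ones((n, 1)), X])
    P = np.zeros((p + 1, p + 1))
    P[1:, 1:] = np.eye(p) if penalty is None else penalty
    beta = np.zeros(p + 1)
    for _ in range(n_iter):
        mu = sigmoid(Xd @ beta)
        grad = Xd.T @ (y - mu) - 2.0 * lam * (P @ beta)
        w = mu * (1.0 - mu)
        hess = Xd.T @ (Xd * w[:, None]) + 2.0 * lam * P  # negative Hessian
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def prediction_errors(y, p_hat, eps=1e-12):
    """Simplified versions of the three error measures named in the summary."""
    p_hat = np.clip(p_hat, eps, 1.0 - eps)
    return {
        "classification": float(np.mean((p_hat > 0.5) != (y == 1))),
        "squared": float(np.mean((y - p_hat) ** 2)),
        "minus_loglik": float(-np.mean(y * np.log(p_hat)
                                       + (1 - y) * np.log(1 - p_hat))),
    }

def cv_error(X, y, lam, k=5, seed=0, penalty=None):
    """k-fold cross-validated prediction errors for one value of lam."""
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    p_oof = np.empty(n)  # out-of-fold predicted probabilities
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        beta = fit_ridge_logistic(X[train_idx], y[train_idx], lam, penalty)
        p_oof[test_idx] = sigmoid(beta[0] + X[test_idx] @ beta[1:])
    return prediction_errors(y, p_oof)

if __name__ == "__main__":
    # Synthetic data: few observations relative to the number of covariates.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(60, 20))
    true_beta = np.linspace(-1.0, 1.0, 20)  # smooth in the covariate index
    y = (rng.random(60) < sigmoid(X @ true_beta)).astype(float)
    for lam in [0.01, 0.1, 1.0, 10.0, 100.0]:
        print(lam, cv_error(X, y, lam, penalty=difference_penalty(20)))
```

Choosing the ridge parameter then amounts to picking the `lam` with the smallest cross-validated error under whichever of the three measures is preferred. One caveat of this sketch: with the difference penalty, if every row of `X` sums to the same constant (as histogram proportions would), the intercept is no longer separately identifiable and the linear solve may fail.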