Explained Variation for Logistic Regression – Small Sample Adjustments, Confidence Intervals and Predictive Precision
Author(s) - Mittlböck M., Schemper M.
Publication year - 2002
Publication title - Biometrical Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.108
H-Index - 63
eISSN - 1521-4036
pISSN - 0323-3847
DOI - 10.1002/1521-4036(200204)44:3<263::aid-bimj263>3.0.co;2-7
Subject(s) - statistics, covariate, mathematics, confidence interval, logistic regression, linear regression, regression, regression analysis, econometrics
The proportion of explained variation in logistic regression can be expressed by the multiple R² originally developed for the general linear model (cf. Mittlböck and Schemper (1996)). In this paper we present a detailed investigation of this measure in small samples and/or with many covariates and propose two alternative adjustments: one is a direct analogue of R²_adj of the general linear model, and the other is based on shrinkage. Furthermore, we explore the use of bootstrap confidence intervals and give a table of the expected variability of estimates of explained variation for samples of varying sizes. We recommend quantifying gains in predictive precision due to prognostic factors by both relative and absolute measures. For binary outcomes, the components of the relative measure, R², are suitable absolute measures of predictive precision. They are interpretable as average absolute residuals computed with and without information on prognostic factors. We motivate the application of the presented measures by the statistical analysis of a study of physical characteristics of urine possibly related to the presence of calcium oxalate crystals.
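As a rough illustration of the measure the abstract describes, the sketch below computes a sum-of-squares R² from binary outcomes and fitted probabilities, using the usual general-linear-model form 1 - SSE/SST. The adjusted version assumes that the "direct analogue of R²_adj" is the familiar degrees-of-freedom correction (n - 1)/(n - k - 1); the abstract does not spell out the formula, so this is an assumption, and the shrinkage-based adjustment is not shown. The function names and the toy data are hypothetical, and the fitted probabilities are taken as given rather than estimated here.

```python
def r2_ss(y, p_hat):
    """Sum-of-squares R^2 = 1 - SSE/SST for binary outcomes y (0/1)
    and fitted probabilities p_hat from some logistic model."""
    n = len(y)
    y_bar = sum(y) / n  # prediction when no covariates are used
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, p_hat))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


def r2_ss_adjusted(y, p_hat, k):
    """Assumed analogue of the linear-model adjusted R^2,
    with k the number of covariates in the model."""
    n = len(y)
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2_ss(y, p_hat))


# Hypothetical toy data: eight outcomes and made-up fitted probabilities.
y = [0, 0, 1, 1, 0, 1, 1, 0]
p_hat = [0.2, 0.3, 0.7, 0.8, 0.4, 0.6, 0.9, 0.1]

print(round(r2_ss(y, p_hat), 4))
print(round(r2_ss_adjusted(y, p_hat, k=1), 4))
```

With small n or large k the correction factor (n - 1)/(n - k - 1) grows, so the adjusted value falls further below the unadjusted one, which is the small-sample situation the paper investigates.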