R²: a useful measure of model performance when predicting a dichotomous outcome
Author(s) - Arlene Ash, Michael Shwartz
Publication year - 1999
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/(sici)1097-0258(19990228)18:4<375::aid-sim20>3.0.co;2-j
Subject(s) - measure (data warehouse), outcome (game theory), statistics, computer science, econometrics, mathematics, data mining, mathematical economics
R² has been criticized as a measure of model performance when predicting a dichotomous outcome, both because its value is often low and because it is sensitive to the prevalence of the event of interest. The C statistic is more widely used to measure model performance in a 0/1 setting. We use a simple parametric family of models to illustrate the potential usefulness of models with low R² values, to clarify the effect of prevalence on both C and R², and to demonstrate how R² captures information not picked up by C. We also show that C is subject to a ‘random mixing’ problem that does not affect R². Finally, we report both R² and C values for different risk-adjustment models in situations with different prevalences and show the relationship between the measures and decile death rates, thereby providing a context for interpreting R² values in a 0/1 setting. Copyright © 1999 John Wiley & Sons, Ltd.
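As a rough illustration of how R² can capture information that C misses, the following minimal sketch computes both measures for a 0/1 outcome. It assumes the usual definitions (R² as one minus the ratio of squared prediction error to total variation around the observed event rate; C as the share of event/non-event pairs in which the event has the higher predicted probability, with ties counted as one half). The function names and the four-observation data set are hypothetical and are not taken from the article, whose parametric family of models is not reproduced here.

```python
from itertools import product


def r_squared(y, p):
    """R^2 = 1 - SSE/SST for 0/1 outcomes y and predicted probabilities p."""
    mean_y = sum(y) / len(y)  # observed prevalence of the event
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def c_statistic(y, p):
    """Share of (event, non-event) pairs ranked correctly; ties count 1/2."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            score += 1.0
        elif p_event == p_non_event:
            score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical data: two sets of predictions with the same ranking
# but very different spread between events and non-events.
y = [0, 0, 1, 1]
p_sharp = [0.10, 0.20, 0.80, 0.90]
p_flat = [0.40, 0.45, 0.55, 0.60]

for label, p in (("sharp", p_sharp), ("flat", p_flat)):
    print(label, "C =", c_statistic(y, p), "R2 =", round(r_squared(y, p), 3))
```

In this toy example both prediction sets rank every event above every non-event, so C equals 1.0 for each, while R² is about 0.90 for the sharper predictions and about 0.28 for the flatter ones. C depends only on the ordering of the predictions, whereas R² also reflects how far apart the predicted probabilities for events and non-events are.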