Variable and threshold selection to control predictive accuracy in logistic regression
Author(s) - Anthony Y. C. Kuk, Jialiang Li, A. John Rush
Publication year - 2014
Publication title - Journal of the Royal Statistical Society: Series C (Applied Statistics)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.205
H-Index - 72
eISSN - 1467-9876
pISSN - 0035-9254
DOI - 10.1111/rssc.12058
Subject(s) - Akaike information criterion, overfitting, logistic regression, statistics, lasso (statistics), information criteria, mathematics, model selection, regression, ranking (information retrieval), selection (genetic algorithm), econometrics, computer science, artificial intelligence, artificial neural network, world wide web
Summary - Using data collected from the ‘Sequenced treatment alternatives to relieve depression’ study, we use logistic regression to predict whether a patient will respond to treatment on the basis of early symptom change and patient characteristics. Model selection criteria such as the Akaike information criterion (AIC) and the mean‐squared error of prediction (MSEP) may not be appropriate if the aim is to predict with a high degree of certainty who will or will not respond to treatment. Towards this aim, we generalize the definition of the positive and negative predictive value curves to the case of multiple predictors. We point out that it is the ordering, rather than the precise values, of the response probabilities that is important, and we arrive at a unified approach to model selection via two‐sample rank tests. To avoid overfitting, we define a cross‐validated version of the positive and negative predictive value curves and compare these curves, after smoothing, for various models. When applied to the study data, our approach yields a ranking of models that differs from the rankings based on AIC and MSEP, and from those produced by a tree‐based method and by regularized logistic regression with a lasso penalty. Our selected model performs consistently well for both 4‐week‐ahead and 7‐week‐ahead predictions.
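
The point that only the ordering of the response probabilities matters can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `ppv_npv_at`, the quantile-based cut-off and the toy data are all assumptions made for illustration. Patients whose score lies above the v-th empirical quantile are called responders and the rest non-responders; the positive predictive value is the observed response rate in the first group, and the negative predictive value is the observed non-response rate in the second. In a cross-validated version, the scores would presumably be out-of-fold predicted probabilities rather than in-sample fits.

```python
from math import floor, log


def ppv_npv_at(scores, labels, v):
    """Empirical (PPV, NPV) when patients above the v-th score quantile
    are classified as responders. Only the ordering of `scores` is used.
    Assumes no tied scores; returns None for a value whose group is empty."""
    n = len(scores)
    # Outcomes sorted by ascending score.
    ordered = [y for _, y in sorted(zip(scores, labels), key=lambda p: p[0])]
    k = floor(v * n)  # number of patients classified as non-responders
    low, high = ordered[:k], ordered[k:]
    ppv = sum(high) / len(high) if high else None
    npv = sum(1 - y for y in low) / len(low) if low else None
    return ppv, npv


# Hypothetical toy data: predicted response probabilities and observed
# outcomes (1 = responded, 0 = did not respond).
scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

print(ppv_npv_at(scores, labels, 0.5))  # (0.75, 0.75)

# A strictly increasing transformation (here the logit) preserves the
# ordering, so both predictive values are unchanged.
logits = [log(p / (1 - p)) for p in scores]
assert ppv_npv_at(logits, labels, 0.5) == ppv_npv_at(scores, labels, 0.5)
```

Evaluating the function over a grid of v values traces out the two curves; because they depend on the scores only through ranks, comparing models in this way connects naturally to the two‐sample rank tests mentioned in the summary.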