Selecting variables in non‐parametric regression models for binary response. An application to the computerized detection of breast cancer
Author(s) -
Roca-Pardiñas Javier,
Cadarso-Suárez Carmen,
Tahoces Pablo G.,
Lado María J.
Publication year - 2008
Publication title -
Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.3472
Subject(s) - receiver operating characteristic , logistic regression , breast cancer , parametric statistics , computer science , statistics , model selection , selection (genetic algorithm) , feature selection , regression analysis , mathematics , artificial intelligence , cancer , medicine
In many biomedical applications, interest lies in being able to distinguish between two possible states of a given response variable, depending on the values of certain continuous predictors. If the number of predictors, p , is high, or if there is redundancy among them, it then becomes important to decide on the selection of the best subset of predictors that will be able to obtain the models with greatest discrimination capacity. With this aim in mind, logistic generalized additive models were considered and receiver operating characteristic (ROC) curves were applied in order to determine and compare the discriminatory capacity of such models. This study sought to develop bootstrap‐based tests that allow for the following to be ascertained: (a) the optimal number q ⩽ p of predictors; and (b) the model or models including q predictors, which display the largest AUC (area under the ROC curve). A simulation study was conducted to verify the behaviour of these tests. Finally, the proposed method was applied to a computer‐aided diagnostic system dedicated to early detection of breast cancer. Copyright © 2008 John Wiley & Sons, Ltd.
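The idea of comparing the discriminatory capacity of two models through their AUCs can be illustrated with a minimal sketch. This is not the authors' bootstrap test: it is a plain percentile bootstrap of the paired AUC difference, computed on fixed predicted scores without refitting any model. The function names `auc` and `bootstrap_auc_diff`, the parameters `n_boot` and `seed`, and the toy data are all hypothetical; in practice the score vectors would come from fitted logistic generalized additive models with and without a candidate predictor.

```python
import random


def auc(scores, labels):
    """Mann-Whitney estimate of the area under the ROC curve (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_diff(scores_a, scores_b, labels, n_boot=500, seed=0):
    """Percentile 95% interval for AUC(a) - AUC(b) under case resampling.

    Assumption: scores are held fixed; no model is refitted per resample.
    """
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that lost one of the two classes
        a = auc([scores_a[i] for i in idx], ys)
        b = auc([scores_b[i] for i in idx], ys)
        diffs.append(a - b)
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper


# Toy data (hypothetical): model A uses one more predictor than model B.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores_a = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
scores_b = [0.3, 0.5, 0.2, 0.7, 0.4, 0.6, 0.5, 0.8]
print(auc(scores_a, labels), auc(scores_b, labels))
print(bootstrap_auc_diff(scores_a, scores_b, labels))
```

If the resulting interval contains zero, this sketch gives no evidence that the larger model discriminates better, which is the kind of question the tests for choosing q ⩽ p predictors address in a more rigorous way.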
