Prediction‐Based Structured Variable Selection through the Receiver Operating Characteristic Curves

Wang Yuanjia; Chen Huaihou; Li Runze; Duan Naihua; Lewis-Fernández Roberto



Author(s) - Wang Yuanjia, Chen Huaihou, Li Runze, Duan Naihua, Lewis-Fernández Roberto

Publication year - 2011

Publication title - Biometrics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 2.298

H-Index - 130

eISSN - 1541-0420

pISSN - 0006-341X

DOI - 10.1111/j.1541-0420.2010.01533.x

Subject(s) - receiver operating characteristic, selection (genetic algorithm), variable (mathematics), computer science, feature selection, statistics, mathematics, artificial intelligence, machine learning, mathematical analysis

Summary: In many clinical settings, a commonly encountered problem is to assess the accuracy of a screening test for early detection of a disease. In these applications, the predictive performance of the test is of interest. Variable selection may be useful in designing a medical test. An example is a research study conducted to design a new screening test by selecting variables from an existing screener with a hierarchical structure among variables: there are several root questions, each followed by its stem questions. The stem questions are asked only after a subject has answered the root question. It is therefore unreasonable to select a model that contains stem variables but not their root variable. In this work, we propose methods to perform variable selection with structured variables when the predictive accuracy of a diagnostic test is the main concern of the analysis. We take a linear combination of individual variables to form a combined test. We then maximize a direct summary measure of the predictive performance of the test, the area under the receiver operating characteristic (ROC) curve (AUC), subject to a penalty function to control for overfitting. Since maximizing the empirical AUC of a combined test is a complicated nonconvex problem (Pepe, Cai, and Longton, 2006, Biometrics 62, 221–229), we explore the connection between the empirical AUC and a support vector machine (SVM). We cast the problem of maximizing the predictive performance of a combined test as a penalized SVM problem and apply a reparametrization to impose the hierarchical structure among variables. We also describe a penalized logistic regression variable selection procedure for structured variables and compare it with the ROC‐based approaches. We use simulation studies based on real data to examine the performance of the proposed methods. Finally, we apply the developed methods to design a structured screener to be used in primary care clinics to refer potentially psychotic patients for further specialty diagnostics and treatment.
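The connection between the empirical AUC of a combined test and an SVM-type objective can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the function names, the L1 penalty, the unit margin in the hinge loss, and the "root coefficient times multiplier" form of the reparametrization. The empirical AUC counts correctly ordered (diseased, non-diseased) pairs, which is a nonconvex step function of the coefficients; replacing the step with a hinge loss on pairwise score differences gives a convex surrogate of the kind an SVM minimizes.

```python
import numpy as np


def empirical_auc(beta, x_pos, x_neg):
    """Empirical AUC of the combined test x @ beta.

    Fraction of (diseased, non-diseased) pairs in which the diseased
    subject scores higher; ties count one half.
    """
    diff = (x_pos @ beta)[:, None] - (x_neg @ beta)[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


def pairwise_hinge_objective(beta, x_pos, x_neg, lam):
    """Convex surrogate: mean hinge loss over all pairs plus a penalty.

    The L1 penalty and the margin of 1 are illustrative assumptions.
    """
    diff = (x_pos @ beta)[:, None] - (x_neg @ beta)[None, :]
    hinge = np.maximum(0.0, 1.0 - diff)
    return float(hinge.mean() + lam * np.abs(beta).sum())


def structured_beta(gamma_root, theta_stem):
    """Hypothetical reparametrization for one root and its stems.

    Each stem coefficient is gamma_root * theta, so every stem drops
    out of the model whenever the root coefficient is zero.
    """
    theta_stem = np.asarray(theta_stem, dtype=float)
    return np.concatenate(([gamma_root], gamma_root * theta_stem))


# Toy data: rows are subjects, columns are test variables.
x_pos = np.array([[1.0, 0.0], [2.0, 1.0]])  # diseased
x_neg = np.array([[0.0, 0.0], [1.0, 1.0]])  # non-diseased
beta = np.array([1.0, 0.0])

auc = empirical_auc(beta, x_pos, x_neg)
obj = pairwise_hinge_objective(beta, x_pos, x_neg, lam=0.1)
beta_dropped = structured_beta(0.0, [2.0, -1.0])
```

In this toy example three of the four pairs are ordered correctly and one is tied. Minimizing the surrogate over the reparametrized coefficients, rather than over `beta` directly, is one way a penalty that zeroes a root coefficient would also remove its stems.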
