Structured sparse logistic regression with application to lung cancer prediction using breath volatile biomarkers
Author(s) - Zhang Xiaochen, Zhang Qingzhao, Wang Xiaofeng, Ma Shuangge, Fang Kuangnan
Publication year - 2019
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.8454
Subject(s) - logistic regression, minimax, lung cancer, computer science, lasso (programming language), stability (learning theory), group (periodic table), mathematical optimization, mathematics, machine learning, medicine, chemistry, organic chemistry, world wide web
This article is motivated by a study of lung cancer prediction using breath volatile organic compound (VOC) biomarkers. The challenge is that the predictors include both high‐dimensional time‐dependent or functional VOC features and time‐independent clinical variables. We consider a high‐dimensional logistic regression and propose two penalties, the group spline penalty and the group smooth penalty, to handle the group structures of the time‐dependent variables in the model. Compared with existing methods such as the group lasso and the group bridge, the new methods are advantageous when the model coefficients are sparse but change smoothly within each group. Our methods are easy to implement because they can be turned into a group minimax concave penalty (MCP) problem after certain transformations. We show that our fitting algorithm has the descent property, which leads to attractive convergence properties. Simulation studies and the lung cancer application demonstrate the accuracy and stability of the proposed approaches.
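For readers unfamiliar with the group minimax concave penalty mentioned in the abstract, the following is a minimal sketch of how the standard group MCP is commonly evaluated: the scalar MCP is applied to the Euclidean norm of each coefficient group. It is not the authors' implementation, and it does not include the transformations that map the group spline or group smooth penalty onto this form, since the abstract does not state them. The function names, the `groups` index layout, and the default `gamma=3.0` are assumptions chosen for illustration.

```python
import math


def mcp(t, lam, gamma):
    """Scalar minimax concave penalty at a non-negative value t.

    Quadratic up to t = gamma * lam, constant afterwards.
    """
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def group_mcp(beta, groups, lam, gamma=3.0):
    """Sum of MCP applied to the Euclidean norm of each group.

    `groups` is a list of index lists, one per group (hypothetical layout).
    """
    total = 0.0
    for idx in groups:
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        total += mcp(norm, lam, gamma)
    return total


# Illustrative coefficients: the first group is entirely zero (sparse),
# the second group has norm 5, beyond the point where MCP flattens out.
beta = [0.0, 0.0, 0.0, 3.0, 4.0]
groups = [[0, 1, 2], [3, 4]]
penalty = group_mcp(beta, groups, lam=1.0, gamma=3.0)
```

Because the penalty is constant once a group norm exceeds `gamma * lam`, large groups are not shrunk further, which is the usual motivation for MCP over the group lasso.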