High‐dimensional variable selection and prediction under competing risks with application to SEER‐Medicare linked data
Author(s) - Hou Jiayi, Paravati Anthony, Hou Jue, Xu Ronghui, Murphy James
Publication year - 2018
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.7822
Subject(s) - covariate , proportional hazards model , computer science , regression analysis , event (particle physics) , event data , statistics , variable (mathematics) , regression , feature selection , model selection , selection (genetic algorithm) , econometrics , data mining , machine learning , mathematics , mathematical analysis , physics , quantum mechanics
Competing risk analysis considers event times due to multiple causes or of more than one event type. Commonly used regression models for such data include (1) the cause‐specific hazards model, which focuses on modeling one type of event while acknowledging other event types simultaneously, and (2) the subdistribution hazards model, which links the covariate effects directly to the cumulative incidence function. Their use in the presence of high‐dimensional predictors is largely unexplored. Motivated by an analysis using the linked SEER‐Medicare database for the purpose of predicting cancer versus noncancer mortality for patients with prostate cancer, we study the accuracy of prediction and variable selection of existing machine learning methods under both models using extensive simulation experiments, including different approaches to choosing penalty parameters in each method. We then apply the optimal approaches to the analysis of the SEER‐Medicare data.
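
For readers unfamiliar with the two models named in the abstract, the standard textbook definitions can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the abstract: T is the event time, ε the event type, Z the covariate vector, and k indexes the cause of interest. The proportional hazards forms shown are the usual Cox-type and Fine-Gray-type specifications, and the paper's exact formulation may differ.

```latex
% Cause-specific hazard for cause k: instantaneous rate of type-k events
% among subjects who have not yet experienced any event.
\lambda_k(t) = \lim_{\Delta t \to 0}
  \frac{P(t \le T < t + \Delta t,\ \varepsilon = k \mid T \ge t)}{\Delta t},
\qquad
\lambda_k(t \mid Z) = \lambda_{k0}(t)\, \exp(\beta_k^\top Z).

% Cumulative incidence function (CIF) for cause k.
F_k(t) = P(T \le t,\ \varepsilon = k).

% Subdistribution hazard for cause k: subjects who already failed from
% another cause remain in the risk set.
\tilde{\lambda}_k(t) = \lim_{\Delta t \to 0}
  \frac{P\bigl(t \le T < t + \Delta t,\ \varepsilon = k \mid
        \{T \ge t\} \cup \{T < t,\ \varepsilon \ne k\}\bigr)}{\Delta t}
  = -\frac{d}{dt} \log\bigl(1 - F_k(t)\bigr).

% Proportional subdistribution hazards link the covariates directly to the CIF.
\tilde{\lambda}_k(t \mid Z) = \tilde{\lambda}_{k0}(t)\, \exp(\gamma_k^\top Z)
\quad\Longrightarrow\quad
F_k(t \mid Z) = 1 - \exp\!\Bigl(-\tilde{\Lambda}_{k0}(t)\, \exp(\gamma_k^\top Z)\Bigr),
\qquad
\tilde{\Lambda}_{k0}(t) = \int_0^t \tilde{\lambda}_{k0}(s)\, ds.
```

The last line is what "links the covariate effects directly to the cumulative incidence function" means in practice: under the subdistribution model the coefficients act on F_k itself, whereas under the cause-specific model the CIF for cause k depends on the hazards of all causes.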
