Statistical‐learning strategies generate only modestly performing predictive models for urinary symptoms following external beam radiotherapy of the prostate: A comparison of conventional and machine‐learning methods
Author(s) - Yahya Noorazrul, Ebert Martin A., Bulsara Max, House Michael J., Kennedy Angel, Joseph David J., Denham James W.
Publication year - 2016
Publication title - Medical Physics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.473
H-Index - 180
eISSN - 2473-4209
pISSN - 0094-2405
DOI - 10.1118/1.4944738
Subject(s) - multivariate adaptive regression splines, dysuria, random forest, logistic regression, mars exploration program, machine learning, support vector machine, artificial intelligence, artificial neural network, receiver operating characteristic, external beam radiotherapy, medicine, statistics, computer science, radiation therapy, regression analysis, bayesian multivariate linear regression, mathematics, urinary system, surgery, brachytherapy, physics, astronomy
Purpose: Given the paucity of available data concerning radiotherapy‐induced urinary toxicity, it is important to ensure derivation of the most robust models with superior predictive performance. This work explores multiple statistical‐learning strategies for prediction of urinary symptoms following external beam radiotherapy of the prostate.
Methods: The performance of logistic regression, elastic‐net, support‐vector machine, random forest, neural network, and multivariate adaptive regression splines (MARS) in predicting urinary symptoms was analyzed using data from 754 participants accrued by TROG03.04‐RADAR. Predictive features included dose‐surface data, comorbidities, and medication intake. Four symptoms were analyzed: dysuria, haematuria, incontinence, and frequency, each with three definitions (grade ≥ 1, grade ≥ 2, and longitudinal), with event rates between 2.3% and 76.1%. Repeated cross‐validations producing matched models were implemented. A synthetic minority oversampling technique was applied to endpoints with rare events. Parameter optimization was performed on the training data. Area under the receiver operating characteristic curve (AUROC) was used to compare performance, with a sample size sufficient to detect differences of ≥0.05 at the 95% confidence level.
Results: Logistic regression, elastic‐net, random forest, MARS, and support‐vector machine were the highest‐performing statistical‐learning strategies in 3, 3, 3, 2, and 1 endpoints, respectively. Logistic regression, MARS, elastic‐net, random forest, neural network, and support‐vector machine were the best, or were not significantly worse than the best, in 7, 7, 5, 5, 3, and 1 endpoints, respectively. The best‐performing statistical model was for dysuria grade ≥ 1, with AUROC ± standard deviation of 0.649 ± 0.074 using MARS. For longitudinal frequency and dysuria grade ≥ 1, all strategies produced AUROC > 0.6, while all haematuria endpoints and longitudinal incontinence models produced AUROC < 0.6.
Conclusions: Logistic regression and MARS were most likely to be the best‐performing strategy for the prediction of urinary symptoms with elastic‐net and random forest producing competitive results. The predictive power of the models was modest and endpoint‐dependent. New features, including spatial dose maps, may be necessary to achieve better models.
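
The abstract reports performance as AUROC ± standard deviation over repeated cross‐validation. As a rough illustration only, and not the authors' code, the sketch below computes AUROC as the probability that a randomly chosen case with the symptom is scored higher than a randomly chosen case without it, and then summarizes per‐fold values. The function names, the toy labels and scores, and the use of the sample standard deviation are all assumptions made for this example.

```python
import statistics


def auroc(labels, scores):
    """AUROC via pairwise comparison of positives and negatives.

    Equivalent to the Mann-Whitney U statistic divided by the number of
    positive/negative pairs; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(fold_aurocs):
    """Mean and sample standard deviation of per-fold AUROC values.

    Assumption: the source does not state which standard deviation
    estimator was used; the sample (n - 1) form is chosen here.
    """
    return statistics.mean(fold_aurocs), statistics.stdev(fold_aurocs)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = symptom present, 0 = symptom absent.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(auroc(labels, scores))  # 3 of 4 positive/negative pairs ranked correctly: 0.75

    # Hypothetical per-fold AUROC values from a repeated cross-validation run.
    mean, sd = summarize([0.60, 0.70, 0.65])
    print(f"{mean:.3f} ± {sd:.3f}")
```

An AUROC of 0.5 corresponds to chance‐level ranking, which is why values around 0.6, as reported above, indicate only modest discrimination.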
