z-logo
open-access-imgOpen Access
Patient-Specific Predictive Modeling Using Random Forests: An Observational Study for the Critically Ill
Author(s) -
Joon Lee
Publication year - 2017
Publication title -
jmir medical informatics
Language(s) - English
Resource type - Journals
ISSN - 2291-9694
DOI - 10.2196/medinform.6690
Subject(s) - critically ill , observational study , random forest , medicine , intensive care medicine , critical illness , predictive value , computer science , machine learning
Background With a large-scale electronic health record repository, it is feasible to build a customized patient outcome prediction model specifically for a given patient. This approach involves identifying past patients who are similar to the present patient and using their data to train a personalized predictive model. Our previous work investigated a cosine-similarity patient similarity metric (PSM) for such patient-specific predictive modeling. Objective The objective of the study is to investigate the random forest (RF) proximity measure as a PSM in the context of personalized mortality prediction for intensive care unit (ICU) patients. Methods A total of 17,152 ICU admissions were extracted from the Multiparameter Intelligent Monitoring in Intensive Care II database. A number of predictor variables were extracted from the first 24 hours in the ICU. Outcome to be predicted was 30-day mortality. A patient-specific predictive model was trained for each ICU admission using an RF PSM inspired by the RF proximity measure. Death counting, logistic regression, decision tree, and RF models were studied with a hard threshold applied to RF PSM values to only include the M most similar patients in model training, where M was varied. In addition, case-specific random forests (CSRFs), which uses RF proximity for weighted bootstrapping, were trained. Results Compared to our previous study that investigated a cosine similarity PSM, the RF PSM resulted in superior or comparable predictive performance. RF and CSRF exhibited the best performances (in terms of mean area under the receiver operating characteristic curve [95% confidence interval], RF: 0.839 [0.835-0.844]; CSRF: 0.832 [0.821-0.843]). RF and CSRF did not benefit from personalization via the use of the RF PSM, while the other models did. Conclusions The RF PSM led to good mortality prediction performance for several predictive models, although it failed to induce improved performance in RF and CSRF. The distinction between predictor and similarity variables is an important issue arising from the present study. RFs present a promising method for patient-specific outcome prediction.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here