
Assessing the predictive ability of the Suicide Crisis Inventory for near‐term suicidal behavior using machine learning approaches
Author(s) - Parghi Neelang, Chennapragada Lakshmi, Barzilay Shira, Newkirk Saskia, Ahmedani Brian, Lok Benjamin, Galynker Igor
Publication year - 2021
Publication title - International Journal of Methods in Psychiatric Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.275
H-Index - 73
eISSN - 1557-0657
pISSN - 1049-8931
DOI - 10.1002/mpr.1863
Subject(s) - random forest, oversampling, recall, logistic regression, receiver operating characteristic, machine learning, artificial intelligence, gradient boosting, psychology, Brier score, boosting (machine learning), poison control, statistics, computer science, mathematics, medicine, medical emergency, cognitive psychology, computer network, bandwidth (computing)
Objective: This study explores the prediction of near‐term suicidal behavior using machine learning (ML) analyses of the Suicide Crisis Inventory (SCI), which measures the Suicide Crisis Syndrome, a presuicidal mental state.
Methods: SCI data were collected from high‐risk psychiatric inpatients (N = 591), grouped by their short‐term suicidal behavior: those who attempted suicide between intake and the 1‐month follow‐up (N = 20) and those who did not (N = 571). Data were analyzed using three predictive algorithms (logistic regression, random forest, and gradient boosting) and three sampling approaches (split sample, synthetic minority oversampling technique [SMOTE], and enhanced bootstrap).
Results: The enhanced bootstrap approach considerably outperformed the other sampling approaches. With it, the random forest algorithm (98.0% precision; 33.9% recall; 71.0% area under the precision‐recall curve [AUPRC]; 87.8% area under the receiver operating characteristic curve [AUROC]) and the gradient boosting algorithm (94.0% precision; 48.9% recall; 70.5% AUPRC; 89.4% AUROC) performed best in predicting positive cases of near‐term suicidal behavior in this dataset.
Conclusions: ML can be useful for analyzing data from psychometric scales such as the SCI and for predicting near‐term suicidal behavior. However, when the data are highly imbalanced, as in the current analysis, the method of measuring performance must be carefully considered and selected.
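The concluding point about choosing a performance measure under class imbalance can be illustrated with a minimal sketch. It uses only the group sizes reported in the abstract (20 attempters, 571 non-attempters). The `Confusion` class, the `confusion` helper, and the always-negative classifier are assumptions introduced for illustration; they are not the study's code, models, or results.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary confusion matrix (hypothetical helper)."""
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def precision(self) -> float:
        # Defined as 0.0 when no positive predictions were made.
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    @property
    def recall(self) -> float:
        # Defined as 0.0 when there are no actual positives.
        actual = self.tp + self.fn
        return self.tp / actual if actual else 0.0


def confusion(y_true, y_pred) -> Confusion:
    """Tally true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
    )


# Group sizes taken from the abstract: 20 positives, 571 negatives.
y_true = [1] * 20 + [0] * 571

# Illustrative degenerate classifier: never predicts an attempt.
always_negative = [0] * len(y_true)
baseline = confusion(y_true, always_negative)

# Prevalence is the precision expected from flagging patients at random.
prevalence = sum(y_true) / len(y_true)

print(f"accuracy:   {baseline.accuracy:.3f}")   # high, despite finding no one
print(f"recall:     {baseline.recall:.3f}")     # zero positives identified
print(f"prevalence: {prevalence:.3f}")
```

With these group sizes, a classifier that never predicts an attempt is correct for about 96.6% of patients while identifying none of the 20 attempters, which is why precision, recall, and AUPRC are more informative than accuracy here. The prevalence of about 3.4% is the precision that random flagging would achieve, so it is a natural reference point when reading precision-based figures.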