Open Access
Artificial Intelligence Approach for Variant Reporting
Author(s) -
Michael G Zomnir,
Lev Lipkin,
Maciej Pacula,
Enrique Meneses,
Allison Macleay,
Sekhar Duraisamy,
Nishchal Nadhamuni,
Saeed H Al Turki,
Zongli Zheng,
Miguel N. Rivera,
Valentina Nardi,
Dora Dias-Santagata,
A. John Iafrate,
Long P. Le,
Jochen K. Lennerz
Publication year - 2018
Publication title - JCO Clinical Cancer Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.188
H-Index - 12
ISSN - 2473-4276
DOI - 10.1200/cci.16.00079
Subject(s) - interpretability, random forest, artificial intelligence, logistic regression, computer science, machine learning, decision tree, pipeline (software), Youden's J statistic, naive Bayes classifier, predictive modelling, data mining, support vector machine, receiver operating characteristic, programming language
Purpose
Next-generation sequencing technologies are actively applied in clinical oncology. Bioinformatics pipeline analysis is an integral part of this process; however, humans cannot yet realize the full potential of the highly complex pipeline output. As a result, the decision to include a variant in the final report during routine clinical sign-out remains challenging.

Methods
We used an artificial intelligence approach to capture the collective clinical sign-out experience of six board-certified molecular pathologists to build and validate a decision support tool for variant reporting. We extracted all reviewed and reported variants from our clinical database and tested several machine learning models. We used 10-fold cross-validation for our variant call prediction model, which derives a continuous prediction score from 0 to 1 (no to yes) for clinical reporting.

Results
For each of the 19,594 initial training variants, our pipeline generates approximately 500 features, which results in a matrix of > 9 million data points. From a comparison of naive Bayes, decision trees, random forests, and logistic regression models, we selected models that allow human interpretability of the prediction score. The logistic regression model demonstrated 1% false negativity and 2% false positivity. The final models’ Youden indices were 0.87 and 0.77 for screening and confirmatory cutoffs, respectively. Retraining on a new assay and performance assessment in 16,123 independent variants validated our approach (Youden index, 0.93). We also derived individual pathologist-centric models (virtual consensus conference function), and a visual drill-down functionality allows assessment of how underlying features contributed to a particular score or decision branch for clinical implementation.

Conclusion
Our decision support tool for variant reporting is a practically relevant artificial intelligence approach to harness the next-generation sequencing bioinformatics pipeline output when the complexity of data interpretation exceeds human capabilities.
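
The abstract reports Youden indices at chosen cutoffs of a 0-to-1 prediction score. As a minimal sketch of what that statistic measures, the Python below computes Youden's J (sensitivity + specificity - 1) at a given cutoff and scans candidate cutoffs for the one with the highest J. The function names `youden_index` and `best_cutoff`, the toy scores, and the toy labels are all hypothetical illustrations, and picking a cutoff by maximizing J is an assumption made for the example; the abstract does not state how the authors chose their screening and confirmatory cutoffs.

```python
from typing import List, Tuple


def youden_index(scores: List[float], labels: List[int], cutoff: float) -> float:
    """Youden's J = sensitivity + specificity - 1 at one cutoff.

    A variant is predicted "report" (1) when its score is >= cutoff.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_report = score >= cutoff
        if label == 1:
            if predicted_report:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_report:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def best_cutoff(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (cutoff, J) for the observed score that maximizes Youden's J."""
    candidates = sorted(set(scores))
    return max(
        ((c, youden_index(scores, labels, c)) for c in candidates),
        key=lambda pair: pair[1],
    )


# Toy data (invented for illustration, not taken from the paper):
# prediction scores in [0, 1] and the pathologists' decision (1 = reported).
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.85, 0.95]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]

cutoff, j = best_cutoff(scores, labels)
print(f"best cutoff = {cutoff:.2f}, Youden's J = {j:.2f}")
```

A J of 1 would mean perfect separation of reported and unreported variants at that cutoff, and a J of 0 means the cutoff does no better than chance. A lower cutoff trades specificity for sensitivity, and a higher one does the reverse.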