Open Access
Prioritization of Free-Text Clinical Documents: A Novel Use of a Bayesian Classifier
Author(s) - Manmohan Singh, Akansh Murthy, Shridhar Singh
Publication year - 2015
Publication title - JMIR Medical Informatics
Language(s) - English
Resource type - Journals
ISSN - 2291-9694
DOI - 10.2196/medinform.3793
Subject(s) - computer science, naive bayes classifier, bayesian probability, preprocessor, posterior probability, artificial intelligence, data mining, classifier (uml), machine learning, precision and recall, smoothing, support vector machine, computer vision
Background: The volume of data arriving at physicians' offices is increasing, making it difficult to process information efficiently and accurately enough to maximize positive patient outcomes. Current manual processes of screening for individual terms within long free-text documents are tedious and error-prone. This paper explores the use of statistical methods and computer systems to assist clinical data management.
Objective: The objective of this study was to verify and validate the use of a naive Bayesian classifier as a means of properly prioritizing important clinical data, specifically free-text radiology reports.
Methods: One hundred reports were first used to train the algorithm, based on physicians' categorization of clinical reports as high-priority or low-priority. The algorithm was then used to evaluate 354 reports. Additional text-cleaning procedures such as section extraction, text preprocessing, and negation detection were performed.
Results: The algorithm evaluated the 354 reports and discriminated between high-priority and low-priority reports, producing a bimodal probability distribution. In all scenarios tested, the false negative rates were below 1.1% and the recall rates ranged from 95.65% to 98.91%. With a 50% prior probability and an 80% threshold probability, the accuracy of the Bayesian classifier was 93.50%, with a positive predictive value (precision) of 80.54%, a sensitivity (recall) of 98.91%, and an F-measure of 88.78%.
Conclusions: The results showed that the algorithm could be trained to detect abnormal radiology results by accurately screening clinical reports. Such a technique can potentially be used to enable automatic flagging of critical results. In addition to being accurate, the algorithm was able to minimize false negatives, which is important for clinical applications.
We conclude that a Bayesian statistical classifier, by flagging reports with abnormal findings, can assist a physician in reviewing radiology reports more efficiently. This higher level of prioritization allows physicians to address important radiologic findings in a timelier manner and may also aid in minimizing errors of omission.
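The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `NaiveBayesPrioritizer`, the helper `f_measure`, the bag-of-words tokenizer, and the add-one (Laplace) smoothing parameter `alpha` are all assumptions made for illustration, and section extraction and negation detection are omitted. Only the 50% prior, the 80% threshold, and the reported precision and recall figures come from the abstract.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesPrioritizer:
    """Hypothetical two-class naive Bayes model over bag-of-words tokens."""

    def __init__(self, prior_high=0.5, threshold=0.8, alpha=1.0):
        self.prior_high = prior_high  # prior probability of "high" (50% in the abstract)
        self.threshold = threshold    # posterior cutoff for flagging (80% in the abstract)
        self.alpha = alpha            # assumed add-one smoothing constant

    def fit(self, reports, labels):
        """Count word occurrences per class; labels are 'high' or 'low'."""
        self.counts = {"high": Counter(), "low": Counter()}
        for text, label in zip(reports, labels):
            self.counts[label].update(tokenize(text))
        self.vocab = set(self.counts["high"]) | set(self.counts["low"])
        self.totals = {c: sum(self.counts[c].values()) for c in self.counts}
        return self

    def _log_likelihood(self, tokens, label):
        """Sum of smoothed log word probabilities; unseen words are skipped."""
        denom = self.totals[label] + self.alpha * len(self.vocab)
        return sum(
            math.log((self.counts[label][t] + self.alpha) / denom)
            for t in tokens
            if t in self.vocab
        )

    def posterior_high(self, text):
        """Posterior probability that a report is high-priority (Bayes' rule)."""
        tokens = tokenize(text)
        log_high = math.log(self.prior_high) + self._log_likelihood(tokens, "high")
        log_low = math.log(1.0 - self.prior_high) + self._log_likelihood(tokens, "low")
        shift = max(log_high, log_low)  # subtract the max for numerical stability
        p_high = math.exp(log_high - shift)
        p_low = math.exp(log_low - shift)
        return p_high / (p_high + p_low)

    def is_high_priority(self, text):
        """Flag the report when the posterior reaches the threshold."""
        return self.posterior_high(text) >= self.threshold


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)


# Toy training data, invented for illustration only.
model = NaiveBayesPrioritizer().fit(
    [
        "large mass suspicious for malignancy",
        "acute fracture with displacement",
        "no acute abnormality",
        "normal chest radiograph",
    ],
    ["high", "high", "low", "low"],
)
print(model.is_high_priority("large suspicious mass"))
print(round(f_measure(0.8054, 0.9891), 4))
```

Applying `f_measure` to the reported precision (80.54%) and recall (98.91%) gives roughly 0.8878, which agrees with the 88.78% F-measure stated in the abstract. Raising `threshold` trades recall for precision, which is why the abstract reports results per prior and threshold scenario.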
