Machine learning‐enabled multitrust audit of stroke comorbidities using natural language processing
Author(s) -
Shek Anthony,
Jiang Zhilin,
Teo James,
Au Yeung Joshua,
Bhalla Ajay,
Richardson Mark P.,
Mah Yee
Publication year - 2021
Publication title -
European Journal of Neurology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.881
H-Index - 124
eISSN - 1468-1331
pISSN - 1351-5101
DOI - 10.1111/ene.15071
Subject(s) - medicine , audit , machine learning , artificial intelligence , data collection , atrial fibrillation , stroke (engine) , computer science , mechanical engineering , statistics , mathematics , management , engineering , economics
Abstract

Background and purpose
With the increasing adoption of electronic records in the health system, machine learning‐enabled techniques offer the opportunity for greater computer‐assisted curation of these data for audit and research purposes. In this project, we evaluate the consistency of traditional curation methods used in routine clinical practice against a new machine learning‐enabled tool, MedCAT, for the extraction of the stroke comorbidities recorded within the UK's Sentinel Stroke National Audit Programme (SSNAP) initiative.

Methods
A total of 2327 stroke admission episodes from three different National Health Service (NHS) hospitals, between January 2019 and April 2020, were included in this evaluation. In addition, current clinical curation methods (SSNAP) and the machine learning‐enabled method (MedCAT) were compared against a subsample of 200 admission episodes manually reviewed by our study team. Performance metrics of sensitivity, specificity, precision, negative predictive value, and F1 scores are reported.

Results
The reporting of stroke comorbidities with current clinical curation methods is good for atrial fibrillation, hypertension, and diabetes mellitus, but poor for congestive cardiac failure. The machine learning‐enabled method, MedCAT, achieved better performances across all four assessed comorbidities compared with current clinical methods, predominantly driven by higher sensitivity and F1 scores.

Conclusions
We have shown machine learning‐enabled data collection can support existing clinical and service initiatives, with the potential to improve the quality and speed of data extraction from existing clinical repositories. The scalability and flexibility of these new machine‐learning tools, therefore, present an opportunity to revolutionize audit and research methods.
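
The performance metrics named in the abstract (sensitivity, specificity, precision, negative predictive value, and F1 score) all follow from the four cells of a binary confusion matrix, with a manual review serving as the reference standard. The Python sketch below illustrates that arithmetic for a single comorbidity. It is not the authors' evaluation code: the function name `binary_metrics`, the `BinaryMetrics` container, and the example labels are hypothetical and invented purely for illustration, and none of the numbers are study data.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    sensitivity: float  # TP / (TP + FN), also called recall
    specificity: float  # TN / (TN + FP)
    precision: float    # TP / (TP + FP), also called positive predictive value
    npv: float          # TN / (TN + FN), negative predictive value
    f1: float           # 2*TP / (2*TP + FP + FN)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(reference, predicted) -> BinaryMetrics:
    """Compare predicted labels with reference labels for one comorbidity.

    Both arguments are equal-length sequences of truthy/falsy values, where
    truthy means "comorbidity present" for that admission episode.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)
    tn = sum(1 for r, p in pairs if not r and not p)
    fp = sum(1 for r, p in pairs if not r and p)
    fn = sum(1 for r, p in pairs if r and not p)

    return BinaryMetrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        precision=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
        f1=_ratio(2 * tp, 2 * tp + fp + fn),
    )


if __name__ == "__main__":
    # Made-up labels for ten episodes (NOT study data): 1 = present, 0 = absent.
    manual_review = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    extracted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(binary_metrics(manual_review, extracted))
```

In a comparison like the one described, the same function would presumably be applied twice per comorbidity, once to each curation method's labels against the manual review. A method that misses true cases accumulates false negatives, which lowers sensitivity and F1 while leaving specificity largely unaffected.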