Open Access
Natural Language Processing for Assessing Quality Indicators in Free-Text Colonoscopy and Pathology Reports: Development and Usability Study
Author(s) - Jung Ho Bae, Hyun Wook Han, Sun Young Yang, Gyuseon Song, Soonok Sa, Gyung Ho Chung, Ji Yeon Seo, Eun Hyo Jin, Hee Cheon Kim, Donguk An
Publication year - 2022
Publication title - JMIR Medical Informatics
Language(s) - English
Resource type - Journals
ISSN - 2291-9694
DOI - 10.2196/35257
Subject(s) - colonoscopy , artificial intelligence , pipeline (software) , computer science , natural language processing , medicine , data set , information extraction , set (abstract data type) , medical physics , colorectal cancer , cancer , programming language
Background - Manual data extraction of colonoscopy quality indicators is time and labor intensive. Natural language processing (NLP), a computer-based linguistics technique, can automate the extraction of important clinical information, such as adverse events, from unstructured free-text reports. NLP information extraction can facilitate the optimization of clinical work by helping to improve quality control and patient management.
Objective - We developed an NLP pipeline to analyze free-text colonoscopy and pathology reports and evaluated its ability to automatically assess adenoma detection rate (ADR), sessile serrated lesion detection rate (SDR), and postcolonoscopy surveillance intervals.
Methods - The NLP tool for extracting colonoscopy quality indicators was developed using a data set of 2000 screening colonoscopy reports from a single health care system, with an associated 1425 pathology reports. The NLP system was then tested on a data set of 1000 colonoscopy reports and its performance was compared with that of 5 human annotators. Additionally, data from 54,562 colonoscopies performed between 2010 and 2019 were analyzed using the NLP pipeline.
Results - The NLP pipeline achieved an overall accuracy of 0.99-1.00 for identifying polyp subtypes, 0.99-1.00 for identifying the anatomical location of polyps, and 0.98 for counting the number of neoplastic polyps. The NLP pipeline achieved performance similar to clinical experts for assessing ADR, SDR, and surveillance intervals. NLP analysis of a 10-year colonoscopy data set identified great individual variance in colonoscopy quality indicators among 25 endoscopists.
Conclusions - The NLP pipeline could accurately extract information from colonoscopy and pathology reports and demonstrated clinical efficacy for assessing ADR, SDR, and surveillance intervals in these reports. Implementation of the system enabled automated analysis and feedback on quality indicators, which could motivate endoscopists to improve the quality of their performance and improve clinical decision-making in colorectal cancer screening programs.
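The abstract does not describe how the pipeline turns extracted findings into ADR and SDR, so the following is only a minimal sketch under stated assumptions. It assumes the NLP step has already produced, for each screening colonoscopy, a count of adenomas and of sessile serrated lesions, and it uses the usual definition of a detection rate: the fraction of procedures with at least one such lesion. The `ColonoscopyRecord` type, its field names, and the `detection_rates` function are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ColonoscopyRecord:
    """Hypothetical per-procedure output of an extraction step (not from the paper)."""
    endoscopist: str
    adenomas: int                   # number of adenomas found in this procedure
    sessile_serrated_lesions: int   # number of sessile serrated lesions found


def detection_rates(records):
    """Return {endoscopist: {"adr": float, "sdr": float}}.

    A rate is the fraction of an endoscopist's procedures in which
    at least one lesion of the given type was detected.
    """
    totals = defaultdict(int)
    with_adenoma = defaultdict(int)
    with_ssl = defaultdict(int)

    for record in records:
        totals[record.endoscopist] += 1
        if record.adenomas > 0:
            with_adenoma[record.endoscopist] += 1
        if record.sessile_serrated_lesions > 0:
            with_ssl[record.endoscopist] += 1

    return {
        name: {"adr": with_adenoma[name] / n, "sdr": with_ssl[name] / n}
        for name, n in totals.items()
    }


# Illustrative, made-up records for two endoscopists.
example_records = [
    ColonoscopyRecord("A", adenomas=2, sessile_serrated_lesions=0),
    ColonoscopyRecord("A", adenomas=0, sessile_serrated_lesions=1),
    ColonoscopyRecord("A", adenomas=0, sessile_serrated_lesions=0),
    ColonoscopyRecord("A", adenomas=1, sessile_serrated_lesions=0),
    ColonoscopyRecord("B", adenomas=0, sessile_serrated_lesions=0),
    ColonoscopyRecord("B", adenomas=3, sessile_serrated_lesions=0),
]
example_rates = detection_rates(example_records)
```

Grouping by endoscopist mirrors the per-endoscopist comparison mentioned in the Results; a real analysis would also need to restrict the denominator to eligible screening procedures, which this sketch does not attempt.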
