Customizable Natural Language Processing Biomarker Extraction Tool
Author(s) - Benjamin Holmes, Dhananjay Chitale, Joshua Loving, Mary Tran, Vinod Subramanian, Anna B. Berry, Matthew J. Rioth, Raghu Warrier, Thomas D. Brown
Publication year - 2021
Publication title - JCO Clinical Cancer Informatics
Language(s) - English
Resource type - Journals
ISSN - 2473-4276
DOI - 10.1200/cci.21.00017
Subject(s) - biomarker, context (archaeology), computer science, precision medicine, information extraction, natural language processing, artificial intelligence, breast cancer, terminology, medicine, cancer, pathology, biology, linguistics, philosophy, paleontology, biochemistry
PURPOSE Natural language processing (NLP) in pathology reports to extract biomarker information is an ongoing area of research. MetaMap is a natural language processing tool developed and funded by the National Library of Medicine to map biomedical text to the Unified Medical Language System Metathesaurus by applying specific tags to clinically relevant terms. Although results are useful without additional postprocessing, these tags lack important contextual information.
METHODS Our novel method takes terminology-driven semantic tags and incorporates those into a semantic frame that is task-specific to add necessary context to MetaMap. We use important contextual information to capture biomarker results to support Community Health System's use of Precision Medicine treatments for patients with cancer. For each biomarker, the name, type, numeric quantifiers, non-numeric qualifiers, and the time frame are extracted. These fields then associate biomarkers with their context in the pathology report such as test type, probe intensity, copy-number changes, and even failed results. A selection of 6,713 relevant reports contained the following standard-of-care biomarkers for metastatic breast cancer: breast cancer gene 1 and 2, estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2, and programmed death-ligand 1.
RESULTS The method was tested on pathology reports from the internal pathology laboratory at Henry Ford Health System. A certified tumor registrar reviewed 400 tests, which showed > 95% accuracy for all extracted biomarker types.
CONCLUSION Using this new method, it is possible to extract high-quality, contextual biomarker information, and this represents a significant advance in biomarker extraction.
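The semantic frame described in the METHODS can be pictured with a small sketch. The Python below is only an illustration under stated assumptions, not the authors' implementation: the tag names (`BIOMARKER`, `TYPE`, `NUMBER`, `QUALIFIER`, `DATE`), the `BiomarkerFrame` class, the `build_frames` function, and the sample spans are all hypothetical, and the input is a hand-written list of (text, tag) pairs rather than real MetaMap output. It shows one plausible way to group tagged terms into a per-biomarker record holding the five fields the abstract lists.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical input: (text, tag) pairs standing in for terminology-driven
# semantic tags. These tag names are invented for illustration and are not
# MetaMap's actual output format.
TaggedSpan = Tuple[str, str]


@dataclass
class BiomarkerFrame:
    """One frame per biomarker mention, with the five fields from the abstract."""
    name: str
    biomarker_type: Optional[str] = None          # "type" field (interpretation assumed)
    numeric_quantifiers: List[str] = field(default_factory=list)
    qualifiers: List[str] = field(default_factory=list)  # non-numeric qualifiers
    time_frame: Optional[str] = None


def build_frames(spans: List[TaggedSpan]) -> List[BiomarkerFrame]:
    """Attach each contextual span to the most recent biomarker mention."""
    frames: List[BiomarkerFrame] = []
    current: Optional[BiomarkerFrame] = None
    for text, tag in spans:
        if tag == "BIOMARKER":
            # A new biomarker mention opens a new frame.
            current = BiomarkerFrame(name=text)
            frames.append(current)
        elif current is None:
            # Context that appears before any biomarker is ignored in this sketch.
            continue
        elif tag == "TYPE":
            current.biomarker_type = text
        elif tag == "NUMBER":
            current.numeric_quantifiers.append(text)
        elif tag == "QUALIFIER":
            current.qualifiers.append(text)
        elif tag == "DATE":
            current.time_frame = text
    return frames


# Invented example spans, not taken from any real pathology report.
spans = [
    ("estrogen receptor", "BIOMARKER"),
    ("immunohistochemistry", "TYPE"),
    ("90%", "NUMBER"),
    ("positive", "QUALIFIER"),
    ("HER2", "BIOMARKER"),
    ("FISH", "TYPE"),
    ("not amplified", "QUALIFIER"),
]
frames = build_frames(spans)
for frame in frames:
    print(frame)
```

The "attach to the most recent biomarker" rule is a deliberate simplification; a task-specific frame as described in the abstract would presumably need richer rules to handle context such as probe intensity, copy-number changes, and failed results.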