Generating a Distilled N-Gram Set - Effective Lexical Multiword Building in the SPECIALIST Lexicon
Author(s) - Chris J. Lu, Destinee Tormey, Lynn McCreedy, Allen C. Browne
Publication year - 2017
Publication title - Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5220/0006142000770087
Subject(s) - lexicon , computer science , n gram , natural language processing , artificial intelligence , set (abstract data type) , gram , programming language , language model , biology , bacteria , genetics
Multiwords are vital to Natural Language Processing (NLP) systems: they make parsers more effective and efficient, refine information retrieval searches, and enhance precision and recall in Medical Language Processing (MLP) applications. The Lexical Systems Group has enhanced the coverage of multiwords in the SPECIALIST Lexicon to provide a more comprehensive resource for such applications. This paper describes a new systematic approach to lexical multiword acquisition from MEDLINE through filters and matchers based on empirical models. The design goals, function descriptions, and various tests and applications of the filters, matchers, and data are discussed. Results include: 1) generating a smaller (38%) distilled MEDLINE n-gram set with better precision and similar recall compared with the full MEDLINE n-gram set; 2) establishing a system for generating high-precision multiword candidates for effective Lexicon building. We believe the MLP/NLP community can benefit from access to these big data (MEDLINE n-gram) sets. We also anticipate accelerated growth of multiwords in the Lexicon with this system. Ultimately, improvements in recall or precision can be anticipated in NLP projects that use the MEDLINE distilled n-gram set, the SPECIALIST Lexicon, and its applications.
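
The abstract does not spell out the individual filters and matchers, so the following is only a minimal sketch of the general idea of distilling an n-gram set: count n-grams over a corpus, then discard candidates that are unlikely to be lexical multiwords. The stopword list, the frequency threshold, and the function names (`ngrams`, `count_ngrams`, `distill`) are all assumptions made for illustration and are not the filters used by the Lexical Systems Group.

```python
from collections import Counter

# Hypothetical stopword list, for illustration only (not from the paper).
STOPWORDS = {"the", "of", "a", "an", "in", "and", "to", "with", "for", "is", "was"}


def ngrams(tokens, max_n=3):
    """Yield every contiguous n-gram (as a tuple) for n = 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def count_ngrams(sentences, max_n=3):
    """Count n-grams over whitespace-tokenized, lowercased sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(ngrams(sentence.lower().split(), max_n))
    return counts


def distill(counts, min_count=2):
    """Keep multiword n-grams that pass two illustrative filters:
    a minimum frequency, and no stopword at either edge."""
    kept = {}
    for gram, count in counts.items():
        if len(gram) < 2:        # single words are not multiwords
            continue
        if count < min_count:    # assumed frequency filter
            continue
        if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
            continue             # assumed edge-stopword filter
        kept[" ".join(gram)] = count
    return kept


sentences = [
    "the patient was treated for heart attack",
    "risk of heart attack in the elderly",
    "heart attack is a leading cause of death",
]
candidates = distill(count_ngrams(sentences))
print(candidates)  # {'heart attack': 3}
```

On this toy input, the only surviving candidate is "heart attack". A real distillation over MEDLINE would need proper tokenization and the empirically derived filters and matchers the paper describes.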
