
An Efficient Method for Biomedical Word Sense Disambiguation Based On Web-Kernel Similarity
Publication year - 2021
Publication title - International Journal of Healthcare Information Systems and Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.266
H-Index - 13
eISSN - 1555-340X
pISSN - 1555-3396
DOI - 10.4018/ijhisi.20211001oa01
Subject(s) - computer science, information retrieval, semantic similarity, context (archaeology), weighting, similarity (geometry), term (time), word (group theory), natural language processing, artificial intelligence, set (abstract data type), kernel (algebra), similarity measure, ontology, measure (data warehouse), data mining, mathematics, medicine, paleontology, philosophy, physics, geometry, radiology, epistemology, quantum mechanics, combinatorics, image (mathematics), biology, programming language
Selecting the correct sense of a polysemous word remains one of the greatest challenges in representing biomedical text. Word Sense Disambiguation (WSD) algorithms mostly rely on an external source of knowledge, such as a thesaurus or ontology, to automatically select the proper concept for an ambiguous term within a given window of context, using semantic similarity and relatedness measures. In this paper, we propose a Web-based kernel function that measures the semantic relatedness between concepts in order to disambiguate an expression against multiple candidate concepts. The measure uses the large volume of documents returned by the PubMed search engine to build a broader context for a short biomedical text, through a new term weighting scheme based on Rough Set Theory (RST). To illustrate the efficiency of the proposed method, we evaluate a WSD algorithm based on this measure on a biomedical dataset (MSH-WSD) that contains 203 ambiguous terms and acronyms. The results show promising improvements.
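The general idea of kernel-based sense selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the snippets are made-up stand-ins for documents that a search engine such as PubMed might return, plain normalised term frequency replaces the paper's RST-based term weighting, and all function and variable names (`context_vector`, `kernel`, `disambiguate`) are hypothetical.

```python
import math
from collections import Counter


def context_vector(snippets):
    """Build an L2-normalised term-frequency vector from text snippets.

    Assumption: raw term frequency stands in for the paper's
    Rough-Set-based weighting, which the abstract does not specify.
    """
    counts = Counter(tok for s in snippets for tok in s.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def kernel(u, v):
    """Dot product of two sparse unit vectors (i.e. cosine similarity)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def disambiguate(context_snippets, sense_snippets):
    """Return the candidate sense whose expanded context is most similar."""
    ctx = context_vector(context_snippets)
    scores = {
        sense: kernel(ctx, context_vector(snips))
        for sense, snips in sense_snippets.items()
    }
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): in a real setting these snippets would be
# retrieved documents, not hand-written strings.
context = ["patient with cold had cough and fever virus infection"]
senses = {
    "common_cold": ["viral infection with cough fever and sore throat"],
    "cold_temperature": [
        "low temperature exposure causes hypothermia and frostbite"
    ],
}

best, scores = disambiguate(context, senses)
print(best, scores)
```

In this toy run the context shares more terms with the snippets for the first sense, so that sense receives the higher kernel score. Replacing the hand-written snippets with retrieved documents and the term-frequency weights with a more discriminative weighting scheme would move the sketch closer to the setting the abstract describes.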