
Acronym Disambiguation using Web Scraping
Author(s) -
K. Premkumar,
V. Atchayaa,
P. Idayavalli,
R. Gayathri
Publication year - 2020
Publication title -
International Journal of Engineering and Advanced Technology
Language(s) - English
Resource type - Journals
ISSN - 2249-8958
DOI - 10.35940/ijeat.d6812.049420
Subject(s) - acronym , computer science , paragraph , context (archaeology) , natural language processing , cosine similarity , artificial intelligence , similarity (geometry) , information retrieval , world wide web , pattern recognition (psychology) , linguistics , paleontology , philosophy , image (mathematics) , biology
Web scraping is a current technology that uses scraping tools to perform tasks much as a human would. It is adopted in many applications, such as e-commerce, dataset creation for machine learning, and advertising. This work focuses on acronym disambiguation, which is part of natural language processing. Acronym disambiguation is mainly used in chatbots, named entity recognition, and other natural language processing tasks. In this paper, an acronym disambiguation system is built by web scraping using Jsoup, and a cosine similarity score is used to identify the most suitable expansion. Our goal is to identify the expansion that suits the acronym based on the context of the paragraph in which it appears. For this we use cosine similarity to calculate a score for each candidate; the expansion that obtains the maximum score is concluded to be the suitable one.
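
The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the text is vectorized, so plain word counts are assumed here, and the scraping step (done with Jsoup in the paper) is replaced by two hard-coded candidate descriptions. The names `term_counts`, `cosine_similarity`, `best_expansion`, and the sample texts are all hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphabetic word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with anything
    return dot / (norm_a * norm_b)


def best_expansion(context, candidates):
    """Score each candidate description against the context paragraph
    and return the expansion with the maximum cosine similarity."""
    context_vec = term_counts(context)
    scores = {
        expansion: cosine_similarity(context_vec, term_counts(description))
        for expansion, description in candidates.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical stand-ins for text that would be scraped from the web.
candidates = {
    "Automated Teller Machine":
        "a machine that lets bank customers withdraw cash and check account balance",
    "Asynchronous Transfer Mode":
        "a network switching technique that transfers data in fixed size cells",
}
context = "She stopped at the ATM to withdraw cash from her bank account."

winner, scores = best_expansion(context, candidates)
print(winner)
```

In this toy input the context shares the words "bank", "withdraw", "cash", and "account" with the first description and no words with the second, so the first expansion receives the higher score. A real system would likely need stop-word removal or term weighting, which this sketch leaves out.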