
Blending fastText and BERT Predictions for Taxonomy Enrichment
Author(s) - Dmitry Puzyrev, Ekaterina Artemova, Artem Shelmanov, Alexander Panchenko
Publication year - 2020
Publication title - Computational Linguistics and Intellectual Technologies (Kompʹûternaâ lingvistika i intellektualʹnye tehnologii)
Language(s) - English
Resource type - Conference proceedings
ISSN - 2075-7182
DOI - 10.28995/2075-7182-2020-19-1117-1122
Subject(s) - taxonomy (biology), computer science, natural language processing, information retrieval, task (project management), similarity (geometry), artificial intelligence, image (mathematics), botany, biology, management, economics
In this paper, we present one of the solutions to the Taxonomy Enrichment shared task co-located with the Dialogue conference. The proposed method blends distributional information from fastText and BERT word embeddings to predict the most likely parent (hypernym) node for a new term in a taxonomy. More specifically, we use both the frequency of each hypernym among the taxonomy entries most similar to the new term and the similarity of the hypernyms themselves. With DeepPavlov-based fastText and RuBERT fine-tuned on news texts and Russian Wikipedia, the method achieves a mean average precision (MAP) of 0.3939 and a mean reciprocal rank (MRR) of 0.4353.
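The abstract does not give the scoring formula, so the following Python sketch is only one plausible reading of the idea: for each embedding model, find the taxonomy entries closest to the new term, count how often each hypernym occurs among them, add the hypernym's own similarity to the new term, and then blend the per-model scores. The function names `score_candidates` and `blend`, the parameters `k`, `alpha`, and `weights`, the linear combination, and the toy 2-D vectors are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def score_candidates(query_vec, node_vecs, parents, k=3, alpha=0.5):
    """Score candidate hypernyms for a new term in one embedding space.

    node_vecs: taxonomy node id -> embedding
    parents:   taxonomy node id -> list of hypernym node ids
    k, alpha:  hypothetical hyperparameters, not values from the paper
    """
    # The k taxonomy entries most similar to the new term.
    neighbours = sorted(
        node_vecs, key=lambda n: cosine(query_vec, node_vecs[n]), reverse=True
    )[:k]

    # How often each hypernym appears among those neighbours.
    freq = defaultdict(int)
    for node in neighbours:
        for hyper in parents.get(node, []):
            freq[hyper] += 1

    # Assumed combination: normalised frequency plus hypernym similarity.
    scores = {}
    for hyper, count in freq.items():
        sim = cosine(query_vec, node_vecs[hyper]) if hyper in node_vecs else 0.0
        scores[hyper] = alpha * count / k + (1 - alpha) * sim
    return scores


def blend(score_dicts, weights):
    """Weighted sum of per-model scores; returns candidates, best first."""
    total = defaultdict(float)
    for scores, weight in zip(score_dicts, weights):
        for hyper, value in scores.items():
            total[hyper] += weight * value
    return sorted(total, key=total.get, reverse=True)


# Toy taxonomy shared by both models.
parents = {"dog": ["animal"], "cat": ["animal"], "car": ["vehicle"],
           "animal": ["entity"], "vehicle": ["entity"]}

# Stand-ins for two embedding spaces (e.g. fastText-like and BERT-like).
model_a = {"dog": (1.0, 0.0), "cat": (0.9, 0.1), "car": (0.0, 1.0),
           "animal": (0.8, 0.2), "vehicle": (0.1, 0.9), "entity": (0.5, 0.5)}
model_b = {"dog": (0.9, 0.2), "cat": (1.0, 0.0), "car": (0.1, 1.0),
           "animal": (0.7, 0.3), "vehicle": (0.0, 1.0), "entity": (0.5, 0.5)}

# Vectors of the new term ("puppy") in each space.
scores_a = score_candidates((1.0, 0.05), model_a, parents)
scores_b = score_candidates((0.95, 0.1), model_b, parents)

ranked = blend([scores_a, scores_b], weights=[0.5, 0.5])
print(ranked)
```

In this toy setup both models place the new term near "dog" and "cat", so their shared hypernym is ranked first. A real system would replace the hand-written vectors with embeddings from the actual models and tune `k`, `alpha`, and `weights` on held-out data.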