Automatic Linking of Terms from Scientific Texts with Knowledge Base Entities



Open Access


Author(s) - A. A. Mezentseva, Elena Bruches, Tatiana Batura

Publication year - 2021

Publication title - Vestnik Novosibirskogo Gosudarstvennogo Universiteta. Seriâ: Informacionnye Tehnologii / Vestnik Novosibirskogo Gosudarstvennogo Universiteta. Seriâ: Informacionnye Tehnologii v Obrazovanii

Language(s) - English

Resource type - Journals

eISSN - 2410-0420

pISSN - 1818-7900

DOI - 10.25205/1818-7900-2021-19-2-65-75

Subject(s) - computer science, knowledge base, information retrieval, entity linking, term (time), set (abstract data type), task (project management), ranking (information retrieval), context (archaeology), quality (philosophy), natural language processing, base (topology), string (physics), rank (graph theory), matching (statistics), artificial intelligence, paleontology, mathematical analysis, philosophy, statistics, physics, mathematics, management, epistemology, quantum mechanics, combinatorics, biology, economics, programming language

As the number of scientific publications grows, tasks related to processing scientific articles are becoming increasingly relevant. Such texts have a specific structure and specific lexical and semantic content, which should be taken into account during processing. Information from knowledge bases can significantly improve the quality of text processing systems. This paper addresses the entity linking task for scientific articles in Russian, where scientific terms are treated as entities. We annotated a corpus of scientific texts in which each term is linked to an entity from a knowledge base. We also implemented an entity linking algorithm and evaluated it on this corpus. The algorithm consists of two stages: generating candidates for an input term, and ranking this set of candidates to choose the best match. To generate the set of candidates, we match the string of the input term against entities in the knowledge base. To rank the candidates and choose the most relevant entity for a term, we use the number of links from each candidate to other entities within the knowledge base and to other sites. We analyzed the results and proposed possible ways to improve the quality of the algorithm, for example by using information about the context and the structure of the knowledge base. The annotated corpus is publicly available and may be useful to other researchers.
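The two-stage scheme described in the abstract (candidate generation by string matching, then ranking by link counts) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Entity` fields, the toy knowledge base, the English example terms, and the choice to score a candidate as the plain sum of its internal links and site links are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical knowledge base record; the field names are assumptions."""
    entity_id: str
    label: str
    aliases: tuple = ()
    internal_links: int = 0  # links to other entities inside the knowledge base
    site_links: int = 0      # links to other sites


def normalize(text):
    """Lowercase the text and collapse whitespace before comparing strings."""
    return " ".join(text.lower().split())


def generate_candidates(term, kb):
    """Stage 1: keep entities whose label or alias exactly matches the term."""
    needle = normalize(term)
    return [
        entity for entity in kb
        if needle in {normalize(name) for name in (entity.label, *entity.aliases)}
    ]


def rank_candidates(candidates):
    """Stage 2: order candidates by total link count (an assumed scoring rule)."""
    return sorted(
        candidates,
        key=lambda entity: entity.internal_links + entity.site_links,
        reverse=True,
    )


def link_term(term, kb):
    """Return the best-ranked entity for the term, or None if nothing matches."""
    ranked = rank_candidates(generate_candidates(term, kb))
    return ranked[0] if ranked else None


# Toy knowledge base with made-up identifiers and link counts.
TOY_KB = [
    Entity("E1", "graph", ("graph (mathematics)",), internal_links=120, site_links=40),
    Entity("E2", "graph", ("chart",), internal_links=30, site_links=10),
    Entity("E3", "neural network", (), internal_links=200, site_links=90),
]

best = link_term("Graph", TOY_KB)
print(best.entity_id if best else None)
```

In this sketch the ambiguous term "Graph" yields two candidates, and the one with more links is chosen. A ranking of this kind ignores the surrounding text, which is why the abstract names context and knowledge base structure as possible improvements.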
