Open Access
Training set augmentation in training neural- network language model for ontology population
Author(s) - Pavel Lomov, Marina Malozemova
Publication year - 2021
Publication title -
Trudy Kolʹskogo Naučnogo Centra RAN (Transactions of the Kola Science Centre of RAS)
Language(s) - English
Resource type - Journals
ISSN - 2307-5252
DOI - 10.37614/2307-5252.2021.5.12.002
Subject(s) - computer science, ontology, artificial neural network, artificial intelligence, natural language processing, training set, machine learning
This paper continues research on solving the problem of ontology population by training a neural-network language model on an automatically generated training set and then using the model to analyze texts in order to discover new concepts to add to the ontology. The article is devoted to text data augmentation, i.e. increasing the size of the training set by modifying its samples. In addition, a solution is considered to the problem of clarifying the concepts found during the automatic formation of the training set, i.e. adjusting their boundaries in sentences. A brief overview is given of existing approaches to text data augmentation, as well as of approaches to extracting so-called nested named entities (nested NER). A procedure is proposed for clarifying the boundaries of the discovered concepts in the training set and for augmenting the set for subsequent training of a neural-network language model that identifies new ontology concepts in domain texts. The results of an experimental evaluation of the trained model and the main directions of further research are presented.
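The abstract does not specify which augmentation operations the authors apply, but a minimal sketch of one common text-augmentation operation (random deletion of non-entity tokens) illustrates the general idea: a BIO-labelled training sample is modified while the concept spans that the model must learn are kept intact. The function name `augment_sentence` and the tag scheme below are illustrative assumptions, not the paper's actual procedure.

```python
import random

def augment_sentence(tokens, tags, p_delete=0.1, seed=None):
    """Create a modified copy of a BIO-labelled training sample by
    randomly deleting tokens tagged "O" (outside any concept span).
    Tokens inside an entity span (B-*/I-* tags) are always kept, so
    the concept boundaries in the augmented sample stay valid."""
    rng = random.Random(seed)
    new_tokens, new_tags = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "O" and rng.random() < p_delete:
            continue  # drop a non-entity token with probability p_delete
        new_tokens.append(tok)
        new_tags.append(tag)
    return new_tokens, new_tags
```

Applying such an operation several times per sample yields additional, slightly different training examples at no annotation cost; in practice it is usually combined with other operations (synonym replacement, token swaps) and the deletion probability is kept small so sentences remain grammatical.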
