
THE EXTRACTION OF TERMS CONSISTING OF SEVERAL WORDS FROM TEXTS IN NATURAL LANGUAGES USING THE SYNTACTIC PATTERNS
Author(s) - Aleksey M. Namestnikov, Aleksey Filippov, Islam M. Shigabutdinov
Publication year - 2021
Publication title - avtomatizaciâ processov upravleniâ
Language(s) - English
Resource type - Journals
ISSN - 1991-2927
DOI - 10.35752/1991-2927-2021-3-65-87-95
Subject(s) - computer science, natural language processing, artificial intelligence, software, parsing, natural language, interface (matter), software system, linguistics, programming language, philosophy, bubble, maximum bubble pressure method, parallel computing
Two problems arise when multi-word terms are extracted with linguistic methods of text analysis: (1) a linguist typically has no skills in software system development, yet is required to express linguistic knowledge as fragments of a software system or as constructions in a formal language; (2) most software developers are not sufficiently qualified in linguistics. Together, these problems create a semantic gap between the methods of linguistic text analysis and their software implementation. The article presents an approach to extracting multi-word terms based on syntactic patterns that is tailored to linguists. The approach does not require a linguist to acquire additional skills or to use various languages to describe syntactic patterns. A prototype of the software system was developed. It allows syntactic patterns to be described without knowledge of a formal language. Moreover, unlike its analogs, the developed system allows its syntactic patterns to be used in external text analysis systems. The server of the prototype provides an interface for creating syntactic patterns.
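To make the idea of pattern-based extraction concrete, the following is a minimal sketch and not the authors' system. It assumes that the text has already been tokenized and part-of-speech tagged by some external tool, and that a syntactic pattern can be approximated as a fixed sequence of tags. The tag names, the PATTERNS list, and the extract_terms function are hypothetical and introduced only for illustration.

```python
from typing import List, Sequence, Tuple

# A token is a (word, part-of-speech tag) pair. Tagging is assumed to be
# done beforehand by an external tool; the tags below are illustrative.
Token = Tuple[str, str]

# Hypothetical syntactic patterns: each one is a sequence of POS tags
# that a candidate multi-word term must match exactly.
PATTERNS: List[Sequence[str]] = [
    ("ADJ", "NOUN"),
    ("NOUN", "NOUN"),
    ("ADJ", "NOUN", "NOUN"),
]


def extract_terms(tokens: Sequence[Token],
                  patterns: Sequence[Sequence[str]] = PATTERNS) -> List[str]:
    """Return every token window whose tags match one of the patterns."""
    terms: List[str] = []
    for start in range(len(tokens)):
        for pattern in patterns:
            window = tokens[start:start + len(pattern)]
            if len(window) != len(pattern):
                continue  # not enough tokens left for this pattern
            if all(tag == wanted for (_, tag), wanted in zip(window, pattern)):
                terms.append(" ".join(word for word, _ in window))
    return terms


# Hand-tagged example sentence (an assumption, not data from the article).
tagged = [
    ("the", "DET"), ("syntactic", "ADJ"), ("pattern", "NOUN"),
    ("describes", "VERB"), ("a", "DET"),
    ("software", "NOUN"), ("system", "NOUN"),
]

print(extract_terms(tagged))
```

Keeping the patterns as plain data rather than code reflects the motivation stated in the abstract: a linguist could edit such a list, for example through an interface, without writing program logic.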