THE ROLE OF THE UDMURT SPELL CHECKER IN REPLENISHMENT OF THE UDMURT NATIONAL CORPUS
Author(s) - Мария Петровна Безенова, Григорий Леонидович Григорьев
Publication year - 2020
Publication title - ežegodnik finno-ugorskih issledovanij
Language(s) - English
Resource type - Journals
eISSN - 2311-0333
pISSN - 2224-9443
DOI - 10.35634/2224-9443-2020-14-3-549-556
Subject(s) - spelling , corpus linguistics , computer science , linguistics , spell , affix , vocabulary , philology , british national corpus , artificial intelligence , library science , history , sociology , gender studies , philosophy , feminism , anthropology
Corpus linguistics is currently one of the most popular branches of linguistics. Most of the world's major languages already have digital corpora of tens or hundreds of millions of word tokens. Recently, special attention has also been paid to creating text corpora for the languages of the peoples of Russia: on the one hand, corpus research makes it possible to look at the structure of a language from a completely different perspective; on the other hand, a corpus is a way of storing language data. The article describes the Udmurt National Corpus, which has been under development since the end of 2019 by the staff of the Philological Research Department of the Udmurt Institute of History, Language and Literature of the Udmurt Federal Research Center of the Ural Branch of the Russian Academy of Sciences. It discusses in detail the capabilities of the information and reference system currently being created, as well as the prospects for using the text corpus in research, in preparing dictionaries, and in developing software for the Udmurt language. The article also covers the Hunspell-based Udmurt spell checker developed by Grigory Grigoriev, which plays an important role in replenishing the Udmurt National Corpus. Before new texts are uploaded to the site, all of them undergo a mandatory check for spelling errors that may have survived proofreading. The spell checker, an extension for text editors, relies on a vocabulary database linked to an affix file that covers all possible morphological variants of the lexemes in the main dictionary. It identifies spelling errors in a text, so that only thoroughly verified texts are uploaded to the website of the Udmurt National Corpus.
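The spell-checking step described above can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the Hunspell file format or API: the dictionary entries, the flag letters, the suffix lists, and the function names `expand` and `find_unknown_words` are all hypothetical, and English toy words stand in for Udmurt lexemes so that no Udmurt morphology is guessed at. The sketch only shows the general idea of pairing a stem dictionary with affix rules to generate the accepted word forms, then flagging tokens that fall outside that set.

```python
import re

# Hypothetical toy data. Real Hunspell .dic/.aff files use a richer format
# (conditions, prefixes, stripping rules); this is only an illustration.
AFFIX_RULES = {
    "V": ["s", "ed", "ing"],  # flag V: verb-like suffixes
    "N": ["s"],               # flag N: plural suffix
}

# Each stem is mapped to the affix flags it accepts.
DICTIONARY = {
    "walk": "V",
    "text": "N",
    "corpus": "",
}


def expand(dictionary, affix_rules):
    """Return every surface form licensed by the stems and their affix flags."""
    forms = set()
    for stem, flags in dictionary.items():
        forms.add(stem)
        for flag in flags:
            for suffix in affix_rules.get(flag, []):
                forms.add(stem + suffix)
    return forms


def find_unknown_words(text, known_forms):
    """Return the lowercased tokens of `text` that are not in `known_forms`."""
    # [^\W\d_]+ matches runs of letters in any script, including Cyrillic.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [token for token in tokens if token not in known_forms]


KNOWN = expand(DICTIONARY, AFFIX_RULES)
suspects = find_unknown_words("Walked texts walkd corpus", KNOWN)
print(suspects)
```

In a workflow like the one the abstract describes, a non-empty list of suspect tokens would mean the text goes back for correction before it is uploaded to the corpus.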
