THE ROLE OF THE UDMURT SPELL CHECKER IN REPLENISHMENT OF THE UDMURT NATIONAL CORPUS

Open Access

Author(s) - Мария Петровна Безенова, Григорий Леонидович Григорьев

Publication year - 2020

Publication title - ežegodnik finno-ugorskih issledovanij

Language(s) - English

Resource type - Journals

eISSN - 2311-0333

pISSN - 2224-9443

DOI - 10.35634/2224-9443-2020-14-3-549-556

Subject(s) - spelling, corpus linguistics, computer science, linguistics, spell, affix, vocabulary, philology, british national corpus, artificial intelligence, library science, history, sociology, gender studies, philosophy, feminism, anthropology

Corpus linguistics is currently one of the most popular branches of linguistics. Most of the world's major languages already have digital corpora of tens or hundreds of millions of word tokens. Recently, special attention has also been paid to creating text corpora for the languages of the peoples of Russia: on the one hand, corpus research offers a completely different perspective on the structure of a language; on the other, a corpus is itself a form of storing language data. The article describes the Udmurt National Corpus, which has been under development since the end of 2019 by the staff of the Philological Research Department of the Udmurt Institute of History, Language and Literature of the Udmurt Federal Research Center of the Ural Branch of the Russian Academy of Sciences. It describes in detail the capabilities of the information and reference system currently being built, as well as the prospects for using the corpus in research, in compiling dictionaries, and in creating software for the Udmurt language. The article also discusses the Hunspell-based Udmurt spell checker developed by Grigory Grigoriev, which plays an important role in replenishing the Udmurt National Corpus. Before new texts are uploaded to the site, all of them undergo a mandatory check for spelling errors that may have survived proofreading. This text-editor extension relies on a vocabulary database linked to an affix file, which together cover all possible morphological variants of the lexemes in the main dictionary; it flags spelling errors in a text so that only thoroughly verified texts are uploaded to the website of the Udmurt National Corpus.
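The dictionary-plus-affix-file approach mentioned in the abstract can be illustrated with a minimal Python sketch. Hunspell itself pairs a word list (.dic) with an affix file (.aff); the code below does not read those formats or call Hunspell, and it is not the authors' implementation. The two stems, the flag name `P`, the simplified plural rules, and the function names `expand` and `find_unknown_words` are all assumptions chosen for illustration, not entries from the actual Udmurt dictionary.

```python
import re

# Toy stand-in for a main dictionary: stem -> set of affix flags.
# Both the stems and the flag "P" are illustrative assumptions.
DICTIONARY = {
    "гурт": {"P"},   # "village"
    "корка": {"P"},  # "house"
}

VOWELS = "аеёиоуыэюяӧ"

# Toy stand-in for an affix file: flag -> list of (suffix, stem condition).
# A heavily simplified plural rule: "-ос" after a vowel, "-ъёс" otherwise.
AFFIX_RULES = {
    "P": [
        ("ос", re.compile(f"[{VOWELS}]$")),
        ("ъёс", re.compile(f"[^{VOWELS}]$")),
    ],
}

# Matches runs of Unicode letters (no digits, no underscore).
WORD_RE = re.compile(r"[^\W\d_]+")


def expand(dictionary, rules):
    """Generate every word form licensed by the stems and their affix flags."""
    forms = set()
    for stem, flags in dictionary.items():
        forms.add(stem)
        for flag in flags:
            for suffix, condition in rules.get(flag, []):
                if condition.search(stem):
                    forms.add(stem + suffix)
    return forms


def find_unknown_words(text, known_forms):
    """Return the words in text that are not among the known forms."""
    return [w for w in WORD_RE.findall(text) if w.lower() not in known_forms]


if __name__ == "__main__":
    known = expand(DICTIONARY, AFFIX_RULES)
    # Words reported here would be reviewed before a text is uploaded.
    print(find_unknown_words("гуртъёс коркаос гуртос", known))
```

In a pre-upload check of the kind the abstract describes, any word reported by such a lookup would be reviewed and corrected before the text enters the corpus. A real affix file also covers case, possessive, and verbal morphology, which this sketch omits.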
