Emostemmer: An Effective Program for Determining Emotions in Russian Using N-grams (Emotiograms)

Open Access

Author(s) - Mohsin Manshad Abbasi, A. P. Bel'tyukov

Publication year - 2021

Publication title - intellektualʹnye sistemy v proizvodstve

Language(s) - English

Resource type - Journals

eISSN - 2410-9304

pISSN - 1813-7911

DOI - 10.22213/2410-9304-2021-4-148-157

Subject(s) - computer science, natural language processing, identification (biology), parsing, root (linguistics), artificial intelligence, word (group theory), government (linguistics), linguistics, philosophy, botany, biology

The analysis of emotions and their expression in texts has attracted growing interest in recent years. Researchers are trying to build intelligent systems that can not only read a text but also determine the emotional state it expresses. The results can be used to train such systems to predict the emotional orientation of texts, their authors, and their readers. This kind of text analysis can also be used to gather feedback about a product or service, or to gauge reactions to an event or government policy. It involves both syntactic and semantic analysis. The syntactic part consists of identifying the words in a text that represent emotions. A stemmer, which reduces a word to its stem or root, plays an important role in this identification. In many languages of the Romano-Germanic group, identifying emotion words is much easier than in Russian, because a single word form represents the emotion regardless of grammatical form and gender. In Russian, where the ending of an emotionally charged word changes with grammatical gender, aspect, and so on, the analysis becomes more complex. There are various methods for identifying emotions in a text. This work focuses on identifying emotions in text while limiting the complexity of the algorithm so that it requires minimal memory and time. We have created the Emostemmer program, an N-gram stemmer (one in which the letters of a word are grouped into sequences of 2 letters, 3 letters, ..., N letters, called N-grams), to identify words that represent emotions in a text. The performance of Emostemmer was compared with that of RuSentiLex by training and testing a support vector machine classifier with both. The results are described in detail in the “Methodology” and “Discussion” sections.
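To make the N-gram idea concrete, the following is a minimal Python sketch, not the Emostemmer implementation itself, which the abstract does not describe in detail. It groups the letters of each word into sequences of 2 to N letters and checks whether any of them coincides with a known emotion stem. The function names `char_ngrams` and `detect_emotions` and the three-entry `EMOTION_STEMS` lexicon are assumptions made purely for illustration.

```python
import re

# Hypothetical mini-lexicon of emotion stems (illustrative only,
# not taken from Emostemmer or RuSentiLex).
EMOTION_STEMS = {"радост": "joy", "груст": "sadness", "страх": "fear"}


def char_ngrams(word, n_min=2, n_max=None):
    """Return the set of contiguous letter sequences of length n_min..n_max."""
    word = word.lower()
    if n_max is None:
        n_max = len(word)
    upper = min(n_max, len(word))
    return {
        word[i:i + n]
        for n in range(n_min, upper + 1)
        for i in range(len(word) - n + 1)
    }


def detect_emotions(text):
    """Return (word, emotion) pairs for words whose N-grams include a known stem."""
    found = []
    # Tokenize on runs of lowercase Cyrillic letters.
    for token in re.findall(r"[а-яё]+", text.lower()):
        grams = char_ngrams(token)
        for stem, emotion in EMOTION_STEMS.items():
            if stem in grams:
                found.append((token, emotion))
    return found


if __name__ == "__main__":
    # Different endings of the same emotion word share the stem "радост".
    print(detect_emotions("Она была радостная, а он грустный."))
```

Because the stem is matched against N-grams rather than whole words, forms such as "радостная", "радостный", and "радостью" all map to the same emotion without a separate lexicon entry for every ending.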
