
WORD2VEC AND BERT LANGUAGE MODELS USED FOR A SENTIMENT ANALYSIS OF TEXT POSTS IN SOCIAL NETWORKS
Author(s) - Nadezhda Yarushkina, Vadim Moshkin, Andrei A. Konstantinov
Publication year - 2020
Publication title - avtomatizaciâ processov upravleniâ
Language(s) - English
Resource type - Journals
ISSN - 1991-2927
DOI - 10.35752/1991-2927-2020-3-61-60-69
Subject(s) - word2vec, computer science, sentiment analysis, artificial intelligence, preprocessor, natural language processing, vectorization (mathematics), feature (linguistics), artificial neural network, information retrieval, machine learning, linguistics, philosophy, embedding, parallel computing
The paper proposes an original algorithm for building the training sample for a neural network that performs sentiment analysis of text posts in social networks. The distinguishing feature of the algorithm is its use of the extended Russian-language semantic thesaurus WordNetAffect together with an expert dictionary of the symbols that authors use to express emotions. The paper also describes how a neural network based on the LSTM architecture is applied to determine the emotional coloring of text messages in a social network, using two text vectorization algorithms, word2vec and BERT. In the experiments, an accuracy of 87% in determining the emotional coloring of messages was achieved with lemmatization as the text preprocessing step and BERT as the vectorization algorithm.
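The abstract does not give the details of the training-sample algorithm, so the following is only a minimal sketch of how lexicon-based labeling of posts might look. Everything in it is an assumption made for illustration: the tiny English dictionaries `EMOTION_WORDS` and `EMOTICONS` stand in for the extended Russian WordNetAffect thesaurus and the expert symbol dictionary, the function names `label_post` and `build_training_sample` are hypothetical, and the majority-vote rule with ambiguous posts discarded is one plausible choice, not the authors' method.

```python
import re

# Hypothetical miniature lexicons (assumptions for illustration only).
# The paper's actual resources, the extended Russian WordNetAffect thesaurus
# and the expert dictionary of emotion symbols, are not reproduced here.
EMOTION_WORDS = {
    "happy": "joy",
    "glad": "joy",
    "sad": "sadness",
    "angry": "anger",
}
EMOTICONS = {
    ":)": "joy",
    ":(": "sadness",
    ">:(": "anger",
}


def tokenize(text):
    """Split a post into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_post(text):
    """Return an emotion label if the lexicons vote unambiguously, else None."""
    votes = {}

    # One vote per emotion word found in the post.
    for token in tokenize(text):
        label = EMOTION_WORDS.get(token)
        if label is not None:
            votes[label] = votes.get(label, 0) + 1

    # One vote per emotion symbol. Longer symbols are matched first and then
    # blanked out, so ">:(" is not counted a second time as ":(".
    remaining = text
    for symbol in sorted(EMOTICONS, key=len, reverse=True):
        count = remaining.count(symbol)
        if count:
            label = EMOTICONS[symbol]
            votes[label] = votes.get(label, 0) + count
            remaining = remaining.replace(symbol, " ")

    if not votes:
        return None  # no emotional markers: the post stays unlabeled

    ranked = sorted(votes.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between emotions: treat the post as ambiguous
    return ranked[0][0]


def build_training_sample(posts):
    """Keep only the posts that receive a label, paired with that label."""
    sample = []
    for post in posts:
        label = label_post(post)
        if label is not None:
            sample.append((post, label))
    return sample


posts = [
    "I am so happy today :)",
    "Nothing to report",
    "sad and angry",
    "Why >:(",
]
training_sample = build_training_sample(posts)
```

In this sketch, posts without markers or with tied votes are dropped rather than guessed, which keeps the automatically labeled sample cleaner at the cost of size. The resulting pairs would then be preprocessed (for example, lemmatized), vectorized, and passed to the classifier.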