
COMBINING FACTS, SEMANTIC ROLES AND SENTIMENT LEXICON IN A GENERATIVE MODEL FOR OPINION MINING
Author(s) - Doron Feldman, Tasnima Sadekova, Konstantin Vorontsov
Publication year - 2020
Publication title - kompʹûternaâ lingvistika i intellektualʹnye tehnologii
Language(s) - English
Resource type - Conference proceedings
ISSN - 2075-7182
DOI - 10.28995/2075-7182-2020-19-283-298
Subject(s) - computer science, sentiment analysis, lexicon, natural language processing, task (project management), artificial intelligence, generative grammar, set (abstract data type), test set, test (biology), generative model, information retrieval, paleontology, management, economics, biology, programming language
Opinion mining is a popular task that is applied, for example, to determine news polarisation and to identify product review classes. Our task is unsupervised clustering of opinionated texts, in particular news on political events. Many papers that tackle this issue use generative models based on lexical features. Our goal is to determine which entities define an opinion, among lexical, syntactic and semantic features as well as their compositions. More specifically, we test the hypothesis that an opinion is determined by the composition of the facts mentioned in a text (SPO triples), the semantic roles of its words and the sentiment lexicon it uses. In this paper we formalise this task and show that using a composition of the above features provides the best quality when clustering opinionated texts. To test this hypothesis we have gathered and labelled two corpora of news on political events and proposed a set of unsupervised algorithms for extracting the features.
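The hypothesis can be illustrated with a toy sketch. It is not the generative model of the paper: it only composes the three feature types into one bag of features per text and compares texts by cosine similarity. The documents, the feature values, and the helper names `featurize` and `cosine` are all hypothetical and hand-written for illustration; in the paper the features come from unsupervised extraction algorithms.

```python
from collections import Counter
from math import sqrt

# Hypothetical, hand-written features for three toy news snippets.
# All three mention the same fact; they differ only in sentiment words.
docs = {
    "a": {"spo": [("government", "raise", "tax")],
          "roles": [("government", "agent"), ("tax", "patient")],
          "sentiment": ["unfair"]},
    "b": {"spo": [("government", "raise", "tax")],
          "roles": [("government", "agent"), ("tax", "patient")],
          "sentiment": ["unfair", "burden"]},
    "c": {"spo": [("government", "raise", "tax")],
          "roles": [("government", "agent"), ("tax", "patient")],
          "sentiment": ["necessary"]},
}

def featurize(doc, modalities):
    """Compose the chosen modalities into one bag of prefixed features."""
    bag = Counter()
    for modality in modalities:
        for item in doc[modality]:
            key = item if isinstance(item, str) else "|".join(item)
            bag[f"{modality}:{key}"] += 1
    return bag

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

ALL = ["spo", "roles", "sentiment"]

# Facts alone cannot separate the texts: "a" and "c" look identical.
facts_ac = cosine(featurize(docs["a"], ["spo"]), featurize(docs["c"], ["spo"]))

# With the composition, "a" is closer to "b" (shared sentiment) than to "c".
full_ab = cosine(featurize(docs["a"], ALL), featurize(docs["b"], ALL))
full_ac = cosine(featurize(docs["a"], ALL), featurize(docs["c"], ALL))

print(facts_ac, full_ab, full_ac)
```

In this toy setting the fact-only representation gives texts "a" and "c" a similarity of 1, while the composed representation ranks "a" nearer to "b" than to "c". A clustering algorithm working on the composed features could therefore separate the two opinions, which is the intuition behind the hypothesis.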