On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages
Author(s) -
Vilares David,
Alonso Miguel A.,
Gómez-Rodríguez Carlos
Publication year - 2015
Publication title -
Journal of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.903
H-Index - 145
eISSN - 2330-1643
pISSN - 2330-1635
DOI - 10.1002/asi.23284
Subject(s) - classifier (uml) , computer science , polarity (international relations) , natural language processing , sentiment analysis , artificial intelligence , lexical analysis , context (archaeology) , linguistics , paleontology , philosophy , genetics , cell , biology
Millions of micro texts are published every day on Twitter. Identifying the sentiment present in them can be helpful for measuring the frame of mind of the public, their satisfaction with respect to a product, or their support of a social event. In this context, polarity classification is a subfield of sentiment analysis focused on determining whether the content of a text is objective or subjective, and in the latter case, if it conveys a positive or a negative opinion. Most polarity detection techniques tend to take into account individual terms in the text and even some degree of linguistic knowledge, but they do not usually consider syntactic relations between words. This article explores how relating lexical, syntactic, and psychometric information can be helpful to perform polarity classification on Spanish tweets. We provide an evaluation for both shallow and deep linguistic perspectives. Empirical results show an improved performance of syntactic approaches over pure lexical models when using large training sets to create a classifier, but this tendency is reversed when small training collections are used.