Leveraging User Ratings for Resource-poor Sentiment Classification
Author(s) - Ngo Xuan Bach, Từ Minh Phương
Publication year - 2015
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2015.08.134
Subject(s) - computer science, vietnamese, artificial intelligence, sentiment analysis, class (philosophy), machine learning, resource (disambiguation), supervised learning, labeled data, data mining, natural language processing, artificial neural network, computer network, philosophy, linguistics
This paper presents a general, simple, yet effective method for weakly supervised sentiment classification in resource-poor languages. Given weak training signals in the form of textual reviews and their associated ratings, which are available on many e-commerce websites, our method computes class distributions for sentences using the statistical information of n-grams in the reviews. These distributions can be used directly to build sentiment classifiers in unsupervised settings, or as extra features to boost classification accuracy in semi-supervised settings. We empirically verified the effectiveness of the proposed method on two datasets, one in Japanese and one in Vietnamese. The results are promising, showing that the method makes relatively accurate predictions even when no labeled data are given. In semi-supervised settings, the method achieved a relative improvement of 1.8% to 4.7% over the purely supervised baseline, depending on the amount of labeled data.
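The abstract does not give the exact statistics used, so the following is only a minimal sketch of the general idea, under stated assumptions: ratings of 4 and above are treated as positive and 2 and below as negative, each n-gram gets a smoothed class distribution from review-level counts, and a sentence's distribution is the average over its known n-grams. The function names, the rating thresholds, the smoothing constant `alpha`, and the averaging rule are all illustrative choices, not the authors' formulation. Input text is assumed to be already tokenized, which matters for languages such as Japanese and Vietnamese.

```python
from collections import Counter, defaultdict

CLASSES = ("pos", "neg")


def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def rating_to_class(rating):
    """Hypothetical mapping from a 1-5 star rating to a weak label."""
    if rating >= 4:
        return "pos"
    if rating <= 2:
        return "neg"
    return None  # assumption: neutral ratings are skipped


def train_ngram_distributions(reviews, n_max=2, alpha=1.0):
    """Estimate a smoothed class distribution for each n-gram.

    reviews: iterable of (tokens, rating) pairs.
    alpha: additive smoothing constant (an assumption of this sketch).
    """
    counts = defaultdict(Counter)
    for tokens, rating in reviews:
        label = rating_to_class(rating)
        if label is None:
            continue
        # Count each n-gram once per review (document frequency).
        for gram in set(ngrams(tokens, n_max)):
            counts[gram][label] += 1

    dists = {}
    for gram, c in counts.items():
        total = sum(c[k] for k in CLASSES) + alpha * len(CLASSES)
        dists[gram] = {k: (c[k] + alpha) / total for k in CLASSES}
    return dists


def sentence_distribution(tokens, dists, n_max=2):
    """Average the class distributions of the sentence's known n-grams."""
    known = [dists[g] for g in ngrams(tokens, n_max) if g in dists]
    if not known:
        # No evidence: fall back to a uniform distribution.
        return {k: 1.0 / len(CLASSES) for k in CLASSES}
    return {k: sum(d[k] for d in known) / len(known) for k in CLASSES}


# Toy data, invented purely for illustration.
reviews = [
    (["great", "phone"], 5),
    (["bad", "battery"], 1),
    (["great", "battery", "life"], 4),
]
dists = train_ngram_distributions(reviews)
scores = sentence_distribution(["great", "phone"], dists)
prediction = max(scores, key=scores.get)
```

In an unsupervised setting, the class with the highest score could serve directly as the prediction. In a semi-supervised setting, the values in `scores` could be appended to a sentence's feature vector before training a standard supervised classifier on the available labeled data.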