
Sentiment Analysis in Portuguese Texts from Online Health Community Forums: Data, Model and Evaluation
Author(s) -
Yohan Bonescki Gumiel,
Isabela Braga Lee,
Tayane Arantes Soares,
Thiago Castro Ferreira,
Adriana Pagano
Publication year - 2021
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5753/stil.2021.17785
Subject(s) - sentiment analysis, computer science, portuguese, random forest, support vector machine, decision tree, task (project management), artificial intelligence, logistic regression, natural language processing, machine learning, online community, world wide web, linguistics, philosophy, management, economics
This study introduces novel data and models for the task of Sentiment Analysis in Portuguese texts about Diabetes Mellitus. The corpus contains 1290 posts retrieved from online health community forums in Portuguese and annotated by two annotators according to three sentiment categories (Positive, Neutral and Negative). An evaluation of traditional machine learning classifiers (Support Vector Machine, Decision Tree, Random Forest and Logistic Regression) and state-of-the-art ones (BERT-based models) showed that the latter perform better on the task, as expected. Data and models are available to the community upon request.
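The corpus was labeled by two annotators using three categories. The abstract does not say how agreement between them was assessed; one common way to quantify it is Cohen's kappa, which corrects the raw agreement rate for agreement expected by chance. The following is a minimal sketch under that assumption. The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the corpus.

```python
from collections import Counter


def cohen_kappa(ann_a, ann_b):
    """Cohen's kappa for two annotators labeling the same items.

    Assumption: this is an illustrative agreement measure, not
    necessarily the one used by the authors.
    """
    if len(ann_a) != len(ann_b) or not ann_a:
        raise ValueError("Both annotators must label the same non-empty set of items")

    n = len(ann_a)

    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(ann_a, ann_b)) / n

    # Chance agreement: for each label, the product of the two
    # annotators' marginal proportions, summed over labels.
    count_a = Counter(ann_a)
    count_b = Counter(ann_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented toy labels for six posts (not real corpus data).
annotator_1 = ["Positive", "Neutral", "Negative", "Negative", "Neutral", "Positive"]
annotator_2 = ["Positive", "Neutral", "Negative", "Neutral", "Neutral", "Positive"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this toy example the annotators agree on five of six posts, while chance agreement is one third, so kappa is lower than the raw agreement rate of about 0.83.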