An Experimental Study on Sentiment Classification of Algerian Dialect Texts
Author(s) -
Leila Moudjari,
Karima Akli-Astouati
Publication year - 2020
Publication title -
Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2020.09.111
Subject(s) - computer science , artificial intelligence , sentiment analysis , lexicon , baseline (sea) , natural language processing , preprocessor , principle of maximum entropy , word embedding , deep learning , random forest , machine learning , popularity , binary classification , support vector machine , embedding , psychology , social psychology , oceanography , geology
This paper studies and compares well-known, commonly used sentiment analysis methods for evaluating the opinions and emotions expressed in Algerian dialect texts. The task is ternary sentiment classification. Using several combinations of text preprocessing and data representation techniques, we compare the results of deep learning models with those of other commonly used algorithms (random forest, maximum entropy, SVM, and a lexicon-based method, for which we tested several lexicons). In our experiments, deep learning models clearly outperform the baseline and offer better accuracy, especially with CNN. To improve modelling results, we set a new baseline for future work: embeddings integrated into the training model. We experimented with different models and data representations, including a recent approach, "contextual embedding", which appeared in 2018 and gained popularity in the NLP community in 2019. Our results open avenues for further research in this domain.
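As an illustration of the lexicon-based approach named in the abstract, the following is a minimal sketch of ternary (positive / negative / neutral) classification by summing word polarities. It is not the authors' implementation: the lexicon entries, the whitespace tokenizer, and the names TOY_LEXICON, tokenize, and classify are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical toy lexicon mapping a word to a polarity of +1 or -1.
# These entries are illustrative only and are not taken from the paper's lexicons.
TOY_LEXICON: Dict[str, int] = {
    "mlih": 1,
    "good": 1,
    "khayeb": -1,
    "bad": -1,
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split on whitespace (an assumed, minimal preprocessing step)."""
    return text.lower().split()


def classify(text: str, lexicon: Dict[str, int] = TOY_LEXICON) -> str:
    """Sum the polarities of known words and map the total to one of three labels."""
    score = sum(lexicon.get(token, 0) for token in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under these assumptions, a text with no lexicon hits, or with balanced hits, is labelled neutral. A real system would need dialect-aware preprocessing and a much larger lexicon.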