Open Access
FastText Word Embedding and Random Forest Classifier for User Feedback Sentiment Classification in Bahasa Indonesia
Author(s) -
Yehezkiel Gunawan,
Julio Christian Young,
Andre Rusli
Publication year - 2022
Publication title -
ultimatics : jurnal ilmu teknik informatika/ultimatics : jurnal teknik informatika
Language(s) - English
Resource type - Journals
eISSN - 2581-186X
pISSN - 2085-4552
DOI - 10.31937/ti.v13i2.2124
Subject(s) - word embedding , computer science , random forest , sentiment analysis , classifier (uml) , embedding , word (group theory) , natural language processing , artificial intelligence , software , machine learning , mathematics , geometry , programming language
User feedback has become a channel through which software developers identify and understand user requirements, preferences, and complaints. It is important for developers to identify the problems reported in user feedback. As software grows, so does the number of users, and reading and classifying each piece of feedback manually wastes time and energy. To address this problem, a sentiment analysis system was built that uses a Random Forest classifier with word embeddings as the feature extraction method to classify feedback as positive, neutral, or negative. The Random Forest algorithm was chosen because it gives the best performance, even though it requires more resources. Furthermore, word embeddings make it possible to detect words that have semantic or syntactic similarities. Word embedding does not require stemming or stop-word removal, so the context of each sentence is preserved. This research implements word embedding to classify the sentiment of user feedback using a Random Forest classifier. The system reached 70.27% accuracy, 80% precision, 54% recall, and a 54% F1 score when the BYU dataset (200 dimensions) was used as the embedding dataset with a train-test ratio of 80:20.
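The feature extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how word vectors are combined into one vector per feedback item, so averaging is an assumption here. The toy 4-dimensional vectors, the sample sentences, and the helper names `embed_feedback` and `train_test_split_80_20` are all hypothetical. A plain dictionary lookup also ignores FastText's subword handling of unseen words.

```python
import random

# Made-up 4-dimensional vectors standing in for pretrained FastText
# embeddings (the abstract mentions a 200-dimension embedding dataset).
TOY_VECTORS = {
    "aplikasi": [0.2, 0.1, 0.0, 0.3],
    "bagus": [0.9, 0.8, 0.1, 0.0],
    "lambat": [-0.7, -0.6, 0.2, 0.1],
    "sangat": [0.1, 0.0, 0.5, 0.4],
}
DIM = 4


def embed_feedback(text, vectors=TOY_VECTORS, dim=DIM):
    """Average the vectors of all known tokens (assumed aggregation).

    No stemming or stop-word removal is applied, matching the abstract.
    """
    tokens = text.lower().split()
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        # No known token: fall back to a zero vector.
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def train_test_split_80_20(rows, seed=0):
    """Shuffle a copy of the rows and split them 80:20."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical labelled feedback in Bahasa Indonesia.
feedback = [
    ("aplikasi sangat bagus", "positive"),
    ("aplikasi lambat", "negative"),
    ("aplikasi", "neutral"),
    ("sangat lambat", "negative"),
    ("bagus", "positive"),
]
features = [(embed_feedback(text), label) for text, label in feedback]
train, test = train_test_split_80_20(features)
```

The resulting `train` and `test` lists hold (vector, label) pairs that could then be passed to a Random Forest implementation, such as scikit-learn's `RandomForestClassifier`, to predict the positive, neutral, or negative label.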