
Sentiment classification of hotel service review on traveloka sites using naïve bayes classifier (NBC) and binary logistic regression
Author(s) -
Silvia Astri Rahmaningrum,
Pratnya Paramitha Oktaviana
Publication year - 2020
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1490/1/012065
Subject(s) - naive bayes classifier , logistic regression , computer science , sentiment analysis , classifier (uml) , artificial intelligence , data mining , binary classification , machine learning , support vector machine
Online travel sites such as Traveloka can support hotel marketing by providing reviews related to online hotel search and booking. A review serves as feedback to the hotel concerned and also helps visitors choose the right hotel. Because this feedback is important text data, a method is needed to classify it, and text mining is an appropriate approach for classifying text data. The data were obtained by web scraping, collecting visitor reviews of Favehotel and Gunawangsa Hotel from the Traveloka site. The text mining methods used in this study are the Naïve Bayes Classifier (NBC) and Binary Logistic Regression, with sentiment labels assigned using a lexicon dictionary. Word cloud visualization shows that, for both hotels, the most frequent keywords associated with positive sentiment are ‘clean’ and ‘comfortable’. A comparison of NBC and Binary Logistic Regression for Favehotel and Gunawangsa Hotel found that Binary Logistic Regression with SMOTE performed better than NBC, with AUC values on the testing data of 0.84 for Favehotel and 0.82 for Gunawangsa Hotel.
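As a rough illustration of two steps named in the abstract, lexicon-based sentiment labeling and AUC evaluation, the following Python sketch may help. It is not the authors' implementation: the word lists, the tie rule (a score of zero is treated as negative), the sample reviews, and the classifier scores are all hypothetical, since the abstract does not specify the lexicon dictionary or the preprocessing used.

```python
# Hypothetical mini-lexicon; the study's actual lexicon dictionary is not given.
POSITIVE = {"clean", "comfortable", "friendly", "good"}
NEGATIVE = {"dirty", "noisy", "rude", "bad"}


def lexicon_score(review: str) -> int:
    """Count positive lexicon hits minus negative lexicon hits in a review."""
    tokens = [t.strip(".,!?;:'\"") for t in review.lower().split()]
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def lexicon_label(review: str) -> int:
    """Return 1 (positive) if the score is above zero, else 0 (negative).

    Treating a zero score as negative is an assumption made for this sketch.
    """
    return 1 if lexicon_score(review) > 0 else 0


def auc(labels, scores) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up reviews, used only to exercise the functions.
reviews = [
    "The room was clean and comfortable.",
    "Noisy corridor and rude staff!",
    "Good breakfast, friendly service",
    "Dirty bathroom",
]
labels = [lexicon_label(r) for r in reviews]

# Made-up classifier scores (e.g. predicted probability of positive sentiment).
example_scores = [0.9, 0.4, 0.35, 0.2]
example_auc = auc(labels, example_scores)
```

In a setup like the one the abstract describes, the labels produced this way would serve as the target for the classifiers, and the AUC would be computed from each classifier's predicted scores on the testing data.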