Open Access
Interrogation of Sentiment Perusing with Hash Counting Vectorizer and Term Inverse Frequency Transformer using Machine Learning Classification
Author(s) -
K.V.S.N. Rama Rao,
M. Shyamala Devi
Publication year - 2019
Publication title -
International Journal of Recent Technology and Engineering
Language(s) - English
Resource type - Journals
ISSN - 2277-3878
DOI - 10.35940/ijrte.d8303.118419
Subject(s) - computer science, random forest, naive bayes classifier, artificial intelligence, classifier (uml), decision tree, machine learning, sentiment analysis, python (programming language), support vector machine, data mining, pattern recognition (psychology), operating system
With fast-growing technology, businesses are moving towards increasing their profit by interpreting customer satisfaction. Customer satisfaction can be analyzed in many ways, and it is the responsibility of a business to analyze it in order to improve turnover and profit. Customers now commonly give their feedback through mobile devices and the internet. Against this background, this paper attempts to analyze the sentiment of customer feedback on movies. The Sentiment Analysis on Movie Review dataset from the Kaggle machine learning repository is used for implementation. The sentiment class is predicted through the following steps. First, the sentiment count for each class is displayed and the top feature words for each sentiment class are extracted from the dataset. Second, the dataset is transformed with a counting vectorizer and then fitted with the following classifiers: Logistic Regression, Linear SVM, Multinomial Naive Bayes, Gradient Boosting, Gaussian Naive Bayes, Random Forest, Decision Tree, and Extra Tree. Third, the dataset is transformed with a hashing vectorizer and then fitted with the same classifiers. Fourth, the dataset is transformed with a TF-IDF vectorizer and then fitted with the same classifiers. Fifth, the dataset is transformed with a TF-IDF transformer and then fitted with the same classifiers. Sixth, the performance of the classifiers is analyzed using the metrics Precision, Recall, F-score, and Accuracy. The implementation is carried out in Python using Spyder (Anaconda Navigator) with the IPython console.
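The feature representations named in the steps above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' code: the toy documents, the function names (`count_vector`, `hashed_vector`, `tfidf_transform`), the bucket count, and the particular IDF formula (`log(N / df) + 1` with L2 normalization) are all assumptions chosen for illustration, since the abstract does not state these details.

```python
import math
import zlib
from collections import Counter

# Toy corpus (hypothetical; stands in for the movie review texts).
docs = [
    "a great movie with a great cast",
    "a dull movie",
    "great fun",
]

def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()

# Counting vectorizer: one column per vocabulary word, value = term count.
vocab = sorted({tok for d in docs for tok in tokenize(d)})
index = {tok: i for i, tok in enumerate(vocab)}

def count_vector(text):
    vec = [0] * len(vocab)
    for tok, n in Counter(tokenize(text)).items():
        if tok in index:
            vec[index[tok]] = n
    return vec

# Hashing vectorizer: no stored vocabulary; each token is hashed to a bucket.
# crc32 is used because Python's built-in hash() of str varies between runs.
def hashed_vector(text, n_buckets=8):
    vec = [0] * n_buckets
    for tok in tokenize(text):
        vec[zlib.crc32(tok.encode("utf-8")) % n_buckets] += 1
    return vec

# TF-IDF transformer: reweights an existing count matrix so that words
# occurring in fewer documents get larger weights, then L2-normalizes rows.
def tfidf_transform(count_rows):
    n_docs = len(count_rows)
    n_cols = len(count_rows[0])
    df = [sum(1 for row in count_rows if row[j] > 0) for j in range(n_cols)]
    idf = [math.log(n_docs / d) + 1.0 if d else 0.0 for d in df]
    out = []
    for row in count_rows:
        weighted = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(x * x for x in weighted)) or 1.0
        out.append([x / norm for x in weighted])
    return out

counts = [count_vector(d) for d in docs]   # counting vectorizer
hashed = [hashed_vector(d) for d in docs]  # hashing vectorizer
tfidf = tfidf_transform(counts)            # TF-IDF transformer on counts

print(vocab)
print(counts[0])
print(hashed[0])
print([round(x, 3) for x in tfidf[0]])
```

In this sketch a TF-IDF vectorizer corresponds to applying `count_vector` and `tfidf_transform` in one step on raw text, whereas the TF-IDF transformer only reweights counts that were already computed. Any of the resulting matrices could then be passed to a classifier.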
Experimental results show that the random forest classifier is more effective at sentiment analysis than the other classifiers, with an accuracy of 89% for the counting vectorizer and the TF-IDF transformer, 87% for the hashing vectorizer, and 88% for the TF-IDF vectorizer.
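For reference, the four evaluation metrics named in the abstract can be computed as in the following minimal sketch. The labels and predictions are invented toy values, not the paper's data, and the helper name `classification_metrics` is hypothetical; the sketch treats one class as "positive", whereas a multi-class setting would need per-class averaging, which the abstract does not specify.

```python
def classification_metrics(y_true, y_pred, positive):
    """Precision, recall, F-score for one class, plus overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-score here is F1, the harmonic mean of precision and recall.
    if precision + recall:
        fscore = 2 * precision * recall / (precision + recall)
    else:
        fscore = 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "fscore": fscore, "accuracy": accuracy}

# Toy labels (hypothetical), only to exercise the function.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
metrics = classification_metrics(y_true, y_pred, positive="pos")
print(metrics)
```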