Assessing naive Bayes and support vector machine performance in sentiment classification on a big data platform

Open Access

Author(s) - Redouane Karsi, Mounia Zaim, Jamila El Alami

Publication year - 2021

Publication title - IAES International Journal of Artificial Intelligence

Language(s) - English

Resource type - Journals

eISSN - 2252-8938

pISSN - 2089-4872

DOI - 10.11591/ijai.v10.i4.pp990-996

Subject(s) - computer science, spark (programming language), naive bayes classifier, support vector machine, machine learning, artificial intelligence, big data, volume (thermodynamics), data mining, database, physics, quantum mechanics, programming language

Mining user reviews has become a very useful means of supporting decision making in several areas. Machine learning algorithms have traditionally been used, widely and effectively, to analyze users' opinions on limited volumes of data. With massive data, powerful hardware resources (CPU, memory, and storage) are essential for completing every data processing phase, including collection, pre-processing, and learning, in optimal time. Several big data technologies have emerged to process massive data efficiently, among them Apache Spark, a distributed data processing framework that provides libraries implementing several machine learning algorithms. To evaluate the performance of Apache Spark's machine learning library (MLlib) on a large volume of data, the classification accuracy and processing time of two machine learning algorithms implemented in Spark, naive Bayes and support vector machine (SVM), are compared with those achieved by the standard implementations of the same two algorithms on large datasets of different sizes built from movie reviews. The results of our experiment show that the classifiers running under Spark outperform the traditional ones and reach an F-measure greater than 84%. We also found that learning time under the Spark framework is relatively low.
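To make the evaluation described in the abstract more concrete, the sketch below trains a small naive Bayes sentiment classifier and scores it with the F-measure. It is a plain-Python stand-in written for illustration, not Spark MLlib and not the authors' code. The multinomial variant, the Laplace smoothing parameter `alpha`, the whitespace tokenizer, the function names, and the toy reviews are all assumptions that do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # label -> word -> count
    for text, label in zip(docs, labels):
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    model = {}
    for label, n_docs in class_counts.items():
        total = sum(word_counts[label].values())
        denom = total + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_docs / len(labels)),
            "log_likelihood": {
                w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
            },
        }
    return model


def predict(model, text):
    """Return the label with the highest log posterior; unseen words are skipped."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for word in text.lower().split():
            if word in params["log_likelihood"]:
                score += params["log_likelihood"][word]
        scores[label] = score
    return max(scores, key=scores.get)


def f_measure(y_true, y_pred, positive="pos"):
    """F-measure (F1): harmonic mean of precision and recall for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy reviews, invented for illustration only (not the paper's data).
train_docs = [
    "a wonderful and moving film",
    "great acting and a great story",
    "boring plot and terrible acting",
    "a dull and terrible film",
]
train_labels = ["pos", "pos", "neg", "neg"]
model = train_naive_bayes(train_docs, train_labels)

test_docs = ["a great and moving story", "terrible and boring"]
test_labels = ["pos", "neg"]
predictions = [predict(model, doc) for doc in test_docs]
score = f_measure(test_labels, predictions)
```

On a distributed platform the same counting step would be spread across the cluster, which is the part a framework such as Spark is meant to handle. This single-process version only illustrates the classifier and the metric, and says nothing about the timings reported in the paper.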
