A Large-Scale Sentiment Data Classification for Online Reviews Under Apache Spark

Open Access

Author(s) - Samar Al-Saqqa, Ghazi AlNaymat, Arafat Awajan

Publication year - 2018

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2018.10.166

Subject(s) - computer science, naive bayes classifier, spark (programming language), support vector machine, machine learning, logistic regression, artificial intelligence, sentiment analysis, scalability, classifier (uml), scale (ratio), data mining, metric (unit), database, operations management, physics, quantum mechanics, economics, programming language

Sentiment analysis of large-scale data has become increasingly important and has attracted many researchers, prompting them to adopt new platforms and tools that can handle large volumes of data. In this paper, we present new evaluation experiments on sentiment analysis for a large-scale dataset of online customer reviews using the Apache Spark data processing system. Apache Spark’s scalable machine learning library (MLlib) is used, and three classification techniques from the library are applied: Naive Bayes, support vector machine, and logistic regression. The results are evaluated using the accuracy metric. Experimental results show that the support vector machine classifier outperforms the Naive Bayes and logistic regression classifiers.
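The accuracy metric mentioned in the abstract is the fraction of reviews whose predicted sentiment label matches the true label. The following is a minimal plain-Python sketch of that computation, not the authors' Spark MLlib code; the function name `accuracy` and the toy labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted label matches the true label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy labels invented for illustration only (1 = positive, 0 = negative).
true_labels = [1, 0, 1, 1]
predicted_labels = [1, 0, 0, 1]
print(accuracy(true_labels, predicted_labels))  # 3 of 4 labels match
```

Applying the same function to the predictions of each classifier on a shared held-out test set gives scores that can be compared directly, which is the kind of comparison the abstract reports.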
