A Performance Evaluation of Classification Algorithms for Big Data
Author(s) - Mo Hai, You Zhang, Yuejin Zhang
Publication year - 2017
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2017.11.479
Subject(s) - speedup, computer science, naive bayes classifier, spark (programming language), random forest, bayes' theorem, algorithm, scale (ratio), data mining, artificial intelligence, parallel computing, bayesian probability, support vector machine, physics, quantum mechanics, programming language
This paper evaluates the performance of two typical classification algorithms in Spark, random forest and naive Bayes, using four metrics: classification accuracy, speedup, scaleup, and sizeup. Experiments are performed on datasets and clusters of different scales. The results show that: (1) the accuracy of both algorithms is high; (2) speedup does not increase linearly, and the number of nodes at which speedup is maximal differs with the size of the dataset; (3) the scaleup of random forest peaks when the number of nodes is 2 and then decreases as the number of nodes increases; (4) for random forest, when the number of nodes is 2 the sizeup increases sharply as the dataset grows, and when the number of nodes is greater than 2 the sizeup increases more slowly; for naive Bayes, when the number of nodes is smaller than 6 the sizeup increases as the dataset grows, whereas when the number of nodes is 6 and the dataset is larger than Sogou_5, the sizeup changes little as the dataset grows further.
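
The abstract does not define speedup, scaleup, or sizeup. The sketch below uses the definitions commonly found in the parallel data mining literature, which is an assumption about what the authors measured, not a restatement of their method. The function names, parameter names, and the run times in the usage note are hypothetical and chosen only for illustration.

```python
# Minimal sketch of three scalability metrics, computed from wall-clock
# run times in seconds. The definitions are the commonly used ones and are
# ASSUMED here; the abstract itself does not spell them out.


def speedup(t_one_node: float, t_m_nodes: float) -> float:
    """Same dataset, 1 node vs. m nodes. Ideal (linear) value is m."""
    return t_one_node / t_m_nodes


def scaleup(t_base: float, t_scaled: float) -> float:
    """Base dataset on 1 node vs. an m-times larger dataset on m nodes.

    Ideal value is 1.0, meaning run time stays constant as data and
    cluster grow together.
    """
    return t_base / t_scaled


def sizeup(t_base_data: float, t_larger_data: float) -> float:
    """Fixed number of nodes, base dataset vs. an n-times larger dataset.

    A value close to n means run time grows in proportion to data size.
    """
    return t_larger_data / t_base_data
```

With hypothetical timings, `speedup(100.0, 40.0)` gives 2.5, `scaleup(100.0, 125.0)` gives 0.8, and `sizeup(100.0, 350.0)` gives 3.5. Under these definitions, a non-linear speedup as in result (2) shows up as a speedup value below the number of nodes used.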
