A hybrid multi‐objective firefly and simulated annealing based algorithm for big data classification
Author(s) - Devi S. Gayathri, Sabrigiriraj M.
Publication year - 2018
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.4985
Subject(s) - computer science, support vector machine, classifier (uml), simulated annealing, naive bayes classifier, artificial intelligence, big data, machine learning, firefly algorithm, feature selection, data mining, algorithm, pattern recognition (psychology), particle swarm optimization
Summary Efficient management of big data has become challenging in recent decades. Online Feature Selection (OFS) is a form of online learning that, in contrast to batch learning, allows a classifier to use a small, fixed number of features. The main aim of this work is to introduce an OFS algorithm based on a meta‐heuristic that exploits the MapReduce paradigm. A novel Hybrid Multi‐Objective Firefly and Simulated Annealing (HMOFSA) algorithm is proposed to select an optimal set of features. As a first step, the original big dataset is decomposed into blocks of examples in the map phase. The HMOFSA algorithm is then applied to select features from the examples. The partial results are then combined into a final feature vector in the reduce phase and evaluated using a Kernel Support Vector Machine (KSVM) classifier. The OFS approach is analyzed using well‐known classifiers (Logistic Regression, KSVM, and Naïve Bayes) implemented within the Spark framework. Experiments conducted on big datasets containing 66 million samples and 2000 attributes confirm the effectiveness of the proposed work. The results of the proposed KSVM classifier are measured in terms of Precision, Recall, Geometric‐mean (G‐mean), F‐measure, and accuracy.
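The map and reduce steps described in the summary can be illustrated with a small, single-machine sketch. This is not the authors' HMOFSA implementation: the per-block selector below is plain simulated annealing over a binary feature mask (no firefly component), the scalarised fitness (class-mean separation minus a per-feature penalty) is a hypothetical stand-in for the paper's objectives, and the majority-vote rule in `reduce_masks` is an assumed way of combining partial results. All function names, parameters, and the toy data are illustrative only.

```python
import math
import random


def split_into_blocks(rows, n_blocks):
    """Map-phase input: deal the examples into n_blocks disjoint blocks."""
    return [rows[i::n_blocks] for i in range(n_blocks)]


def fitness(mask, block, penalty=0.5):
    """Hypothetical objective: separation of class means on the selected
    features, minus a fixed penalty for every selected feature."""
    pos = [x for x, y in block if y == 1]
    neg = [x for x, y in block if y == 0]
    if not pos or not neg:
        return 0.0
    score = 0.0
    for j, keep in enumerate(mask):
        if keep:
            mean_pos = sum(x[j] for x in pos) / len(pos)
            mean_neg = sum(x[j] for x in neg) / len(neg)
            score += abs(mean_pos - mean_neg)
    return score - penalty * sum(mask)


def select_features(block, n_features, rng, steps=200, temp=1.0, cooling=0.98):
    """Stand-in for HMOFSA: simulated annealing over a binary feature mask."""
    mask = [rng.randint(0, 1) for _ in range(n_features)]
    current = fitness(mask, block)
    best_mask, best = mask[:], current
    for _ in range(steps):
        candidate = mask[:]
        j = rng.randrange(n_features)
        candidate[j] = 1 - candidate[j]  # flip one feature in or out
        delta = fitness(candidate, block) - current
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temp), which shrinks as temp cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            mask, current = candidate, current + delta
            if current > best:
                best_mask, best = mask[:], current
        temp *= cooling
    return best_mask


def reduce_masks(masks, threshold=0.5):
    """Reduce phase (assumed rule): keep a feature if at least a
    `threshold` fraction of the blocks selected it."""
    n = len(masks)
    return [1 if sum(col) / n >= threshold else 0 for col in zip(*masks)]


def make_toy_data(n_rows, n_features, informative, rng):
    """Synthetic binary-class data; only `informative` columns carry signal."""
    rows = []
    for _ in range(n_rows):
        y = rng.randint(0, 1)
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            x[j] += 3.0 * y
        rows.append((x, y))
    return rows


def run_pipeline(seed=0):
    rng = random.Random(seed)
    data = make_toy_data(400, 8, informative=(0, 3), rng=rng)
    blocks = split_into_blocks(data, 4)                  # map: partition
    masks = [select_features(b, 8, rng) for b in blocks]  # map: select per block
    return reduce_masks(masks)                           # reduce: combine masks


if __name__ == "__main__":
    print(run_pipeline())
```

In a real deployment the per-block selection would run in parallel on a MapReduce or Spark cluster, and the final mask would be used to project the data before training a classifier such as KSVM; here everything runs sequentially to keep the sketch self-contained.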