
Feature Selection using Stochastic Diffusion Search Algorithm in Big Data Analysis
Publication year - 2020
Publication title - International Journal of Recent Technology and Engineering
Language(s) - English
Resource type - Journals
ISSN - 2277-3878
DOI - 10.35940/ijrte.d1051.1284s519
Subject(s) - computer science, feature selection, feature (linguistics), data mining, support vector machine, naive bayes classifier, big data, heuristic, artificial intelligence, search algorithm, selection (genetic algorithm), machine learning, pattern recognition (psychology), algorithm, philosophy, linguistics
Big Data analysis is the processing or mining of massive datasets to retrieve useful information. Among the methods used for Big Data analysis, feature selection has been found to be extremely effective. A common approach is to search for a subset of relevant features that still represents the dataset faithfully. However, searching over feature subsets is a time-consuming combinatorial problem, so meta-heuristic algorithms are commonly used to facilitate feature selection. Stochastic Diffusion Search (SDS) is a multi-agent global search algorithm that relies on simple interactions between agents to tackle combinatorial problems. In this work, SDS chooses the feature subset for the classification task. The Classification and Regression Tree (CART), Naïve Bayes (NB), Support Vector Machine (SVM) and K-Nearest Neighbour (KNN) were used as classifiers. The results showed that the proposed method achieved better performance than existing techniques.
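
The abstract does not give the details of the SDS variant used, so the following is only a minimal sketch of how a generic SDS loop could drive feature subset selection. It is not the authors' implementation. Several things here are assumptions made for illustration: each agent's hypothesis is a feature subset, the test phase compares two agents on a small random sample of rows, a simple nearest-centroid classifier stands in for CART, NB, SVM or KNN, and all names (`subset_accuracy`, `random_subset`, `sds_select`) and parameter values are hypothetical.

```python
import random


def subset_accuracy(X, y, features, rows):
    """Resubstitution accuracy of a nearest-centroid classifier that sees only
    the columns in `features`, fitted and scored on the given `rows`."""
    sums, counts = {}, {}
    for i in rows:
        acc = sums.setdefault(y[i], [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += X[i][f]
        counts[y[i]] = counts.get(y[i], 0) + 1
    centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}

    def predict(i):
        # Label of the closest class centroid (squared Euclidean distance).
        return min(centroids, key=lambda c: sum(
            (X[i][f] - centroids[c][k]) ** 2 for k, f in enumerate(features)))

    return sum(predict(i) == y[i] for i in rows) / len(rows)


def random_subset(n_features, rng):
    """Draw a non-empty random feature subset as a sorted tuple of column indices."""
    chosen = [f for f in range(n_features) if rng.random() < 0.5]
    return tuple(chosen) if chosen else (rng.randrange(n_features),)


def sds_select(X, y, n_agents=30, n_iters=40, sample_size=10, seed=0):
    """Assumed SDS adaptation: returns (selected feature indices, accuracy)."""
    rng = random.Random(seed)
    n_rows, n_features = len(X), len(X[0])
    agents = [random_subset(n_features, rng) for _ in range(n_agents)]

    for _ in range(n_iters):
        # Test phase: cheap partial evaluation. An agent is active if its subset
        # scores at least as well as a randomly chosen agent (possibly itself)
        # on the same small random sample of rows.
        active = []
        for i in range(n_agents):
            j = rng.randrange(n_agents)
            rows = rng.sample(range(n_rows), min(sample_size, n_rows))
            active.append(subset_accuracy(X, y, agents[i], rows)
                          >= subset_accuracy(X, y, agents[j], rows))

        # Diffusion phase: each inactive agent polls a random agent and copies
        # its subset if that agent is active, otherwise draws a fresh subset.
        for i in range(n_agents):
            if not active[i]:
                j = rng.randrange(n_agents)
                agents[i] = agents[j] if active[j] else random_subset(n_features, rng)

    # Report the agent hypothesis that scores best on the full dataset.
    all_rows = range(n_rows)
    best = max(agents, key=lambda h: subset_accuracy(X, y, h, all_rows))
    return list(best), subset_accuracy(X, y, best, all_rows)


# Toy data (made up for this sketch): column 0 separates the two classes,
# columns 1-5 are uniform noise.
toy_rng = random.Random(1)
y_toy = [i % 2 for i in range(40)]
X_toy = [[10 * label + toy_rng.random()] + [toy_rng.random() for _ in range(5)]
         for label in y_toy]
selected, accuracy = sds_select(X_toy, y_toy)
print("selected features:", selected, "accuracy:", accuracy)
```

Scoring on a small random sample keeps each test cheap, which is the usual motivation for partial evaluation in SDS. In practice, the stand-in classifier would be replaced by one of the classifiers named above and scored on held-out data rather than by resubstitution.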