
An effective classification approach for big data with parallel generalized Hebbian algorithm
Author(s) -
Ahmed Hussein Ali,
Royida A. Ibrahem Alhayali,
Mostafa Abdulghafoor Mohammed,
Tole Sutikno
Publication year - 2021
Publication title -
Bulletin of Electrical Engineering and Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.251
H-Index - 12
ISSN - 2302-9285
DOI - 10.11591/eei.v10i6.3135
Subject(s) - big data, spark (programming language), computer science, dimensionality reduction, hebbian theory, data mining, principal component analysis, artificial neural network, volume (thermodynamics), process (computing), artificial intelligence, clustering high dimensional data, unstructured data, machine learning, algorithm, cluster analysis, physics, quantum mechanics, programming language, operating system
Advancements in information technology are contributing to the rapid rate at which big data is now generated. Big data refers to datasets so large in volume that processing and transmitting them with the available resources consumes considerable time and space. Big data also covers data in both structured and unstructured formats. Many agencies are currently supporting research on big data analytics because existing data processing techniques cannot keep up with the rate at which big data is generated. This paper presents an efficient classification and dimensionality reduction technique for big data based on the parallel generalized Hebbian algorithm (GHA), one of the commonly used neural network (NN) learning algorithms for principal component analysis (PCA). The proposed method was compared with existing methods to demonstrate its capability in reducing the dimensionality of big data. The proposed method is implemented on the Spark Radoop platform.
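
For readers unfamiliar with GHA (also known as Sanger's rule), the following is a minimal single-machine sketch of the standard update, dW = lr * (y x^T - LT[y y^T] W) with y = W x, where LT keeps the lower triangle. It is not the parallel Spark implementation described in the paper; the function name `gha_fit` and the parameters `lr`, `epochs`, and `seed` are illustrative assumptions.

```python
import numpy as np

def gha_fit(X, n_components, lr=0.01, epochs=50, seed=0):
    """Estimate leading principal directions with the generalized Hebbian rule.

    Hypothetical helper for illustration only; a sequential sketch,
    not the parallel method evaluated in the paper.
    """
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)  # GHA assumes zero-mean inputs
    # One weight row per extracted component, small random start
    W = rng.normal(scale=0.1, size=(n_components, X.shape[1]))
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:  # shuffle samples each epoch
            y = W @ x  # outputs of the linear network
            # Hebbian term minus lower-triangular decorrelation term
            W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W
```

Projecting centered data with `X_centered @ W.T` then gives the reduced representation. In a distributed setting, one plausible design is to compute such updates per data partition and combine them, but how the paper does this is not stated in the abstract.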