Open Access
k‐Sparse extreme learning machine
Author(s) - Raza N., Tahir M., Ali K.
Publication year - 2020
Publication title - Electronics Letters
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.375
H-Index - 146
eISSN - 1350-911X
pISSN - 0013-5194
DOI - 10.1049/el.2020.1840
Subject(s) - overfitting , extreme learning machine , benchmark (surveying) , computer science , layer (electronics) , generalization , artificial intelligence , artificial neural network , linearity , principal component analysis , measure (data warehouse) , process (computing) , matrix (chemical analysis) , activation function , pattern recognition (psychology) , algorithm , machine learning , data mining , mathematics , mathematical analysis , chemistry , materials science , geodesy , organic chemistry , composite material , geography , operating system , physics , quantum mechanics
The extreme learning machine (ELM) is a single-layer feed-forward neural network with the advantages of fast training and good generalisation. However, when the size of the hidden layer is increased, both advantages are lost, as the redundant information may cause overfitting. The traditional way to deal with this issue is to introduce regularisation that promotes sparsity, but only in the output-layer weight matrix. In this Letter, we propose instead to impose sparsity on the output of the hidden layer, using it as the only non-linearity in the hidden layer. In the proposed formulation, we use a linear activation function in the hidden layer and keep only the k highest-activity neurons, with k serving as the measure of sparsity. Using principal component analysis, we project the resulting hidden-layer output matrix onto a low-dimensional space to further remove redundant and irrelevant information and to speed up the training process. To verify the feasibility and effectiveness of the proposed method, we test it against a number of ELM variants on benchmark datasets. The results demonstrate that the proposed method consistently achieves better accuracy across many different benchmark datasets.
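
The Letter itself does not include code, but the pipeline the abstract describes can be illustrated with a minimal NumPy sketch: a fixed random hidden layer with linear activation, a top-k sparsification of the hidden outputs as the only non-linearity, a PCA projection of the sparse hidden-layer output, and output weights solved by least squares. All function and parameter names here (ksparse_elm_train, n_hidden, k, n_components) are hypothetical, and whether the top-k selection is made on absolute or signed activations is an assumption not stated in the abstract.

    import numpy as np

    def ksparse_elm_train(X, Y, n_hidden=500, k=50, n_components=20, rng=None):
        # X: (n_samples, n_features), Y: (n_samples, n_outputs).
        # Sketch only; names and defaults are illustrative, not from the Letter.
        rng = np.random.default_rng(rng)
        # Random input weights and biases, fixed as in standard ELM.
        W = rng.standard_normal((X.shape[1], n_hidden))
        b = rng.standard_normal(n_hidden)
        # Linear activation: H is the raw hidden-layer output.
        H = X @ W + b
        # k-sparse step: per sample, keep the k highest-activity neurons and
        # zero the rest (assumed here to mean largest absolute activation);
        # this is the only non-linearity in the hidden layer.
        idx = np.argpartition(np.abs(H), -k, axis=1)[:, :-k]
        H_sparse = H.copy()
        np.put_along_axis(H_sparse, idx, 0.0, axis=1)
        # PCA: project the hidden-layer output onto a low-dimensional space.
        mu = H_sparse.mean(axis=0)
        Hc = H_sparse - mu
        _, _, Vt = np.linalg.svd(Hc, full_matrices=False)
        P = Vt[:n_components].T          # principal directions, (n_hidden, n_components)
        H_low = Hc @ P
        # Output weights via the Moore-Penrose pseudo-inverse, as in ELM.
        beta = np.linalg.pinv(H_low) @ Y
        return W, b, mu, P, beta

    def ksparse_elm_predict(X, W, b, mu, P, beta, k):
        # Apply the same linear map, top-k sparsification, and PCA projection.
        H = X @ W + b
        idx = np.argpartition(np.abs(H), -k, axis=1)[:, :-k]
        H_sparse = H.copy()
        np.put_along_axis(H_sparse, idx, 0.0, axis=1)
        return (H_sparse - mu) @ P @ beta

In this reading, the top-k step acts as a hard winner-take-all gate on an otherwise linear hidden layer, and the PCA projection shrinks the design matrix before the pseudo-inverse, which is what speeds up training relative to solving against the full hidden-layer output.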