Firefly Algorithm based Feature Selection for Arabic Text Classification
Author(s) - Souad Larabi-Marie-Sainte, Nada Alalyani
Publication year - 2018
Publication title - Journal of King Saud University - Computer and Information Sciences
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.617
H-Index - 33
eISSN - 2213-1248
pISSN - 1319-1578
DOI - 10.1016/j.jksuci.2018.06.004
Subject(s) - computer science , feature selection , artificial intelligence , classifier (uml) , support vector machine , firefly algorithm , pattern recognition (psychology) , feature vector , selection (genetic algorithm) , feature (linguistics) , data mining , vector space model , machine learning , natural language processing , particle swarm optimization , linguistics , philosophy
Due to the large number of documents available on the internet, in emails, and in digital libraries, document classification has become a crucial task. It is commonly performed after feature selection, which consists of selecting appropriate features to enhance classification accuracy. Most feature selection based text classification methods rely on building a term frequency-inverse document frequency (TF-IDF) feature vector, which is usually not efficient. In addition, numerous document classification studies focus on the English language. This paper deals with Arabic Text Classification, which has not been intensively studied due to the complexity of the Arabic language. A new firefly algorithm based feature selection method is proposed. The firefly algorithm has been successfully applied to various combinatorial problems. However, it has not previously been used for feature selection in Arabic Text Classification. To validate this technique, a Support Vector Machine classifier is used, along with three evaluation measures: precision, recall, and F-measure. Furthermore, experiments on the real-world OSAC dataset are performed, along with a comparison with state-of-the-art methods. The proposed method achieves a precision of 0.994. The results confirm the efficiency of the proposed feature selection method in improving Arabic Text Classification accuracy.
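The abstract does not give the details of the proposed algorithm, so the following is only a minimal sketch of how a firefly-style search over feature subsets might look, not the authors' method. It assumes a simple binary variant in which each firefly is a 0/1 feature mask, a dimmer firefly copies bits from a brighter one with a probability that decays with their normalized Hamming distance, and a small random bit flip plays the role of the random walk. The function name `binary_firefly_select`, its parameters, and the `toy_fitness` function are all hypothetical; in a setting like the one described, the fitness would instead be a score obtained by training a Support Vector Machine on the selected features.

```python
import math
import random


def binary_firefly_select(fitness, n_features, n_fireflies=8, n_iter=20,
                          beta0=1.0, gamma=1.0, alpha=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness` (illustrative only)."""
    rng = random.Random(seed)
    # Each firefly is a random binary mask: 1 keeps a feature, 0 drops it.
    swarm = [[rng.randint(0, 1) for _ in range(n_features)]
             for _ in range(n_fireflies)]
    light = [fitness(mask) for mask in swarm]  # brightness = fitness value
    best = max(range(n_fireflies), key=lambda k: light[k])
    best_mask, best_score = list(swarm[best]), light[best]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue  # firefly i only moves toward brighter fireflies
                # Normalized Hamming distance between the two masks.
                r = sum(a != b for a, b in zip(swarm[i], swarm[j])) / n_features
                # Attractiveness decays with distance: beta0 * exp(-gamma * r^2).
                beta = beta0 * math.exp(-gamma * r * r)
                for d in range(n_features):
                    if rng.random() < beta:
                        swarm[i][d] = swarm[j][d]      # attraction step
                    if rng.random() < alpha:
                        swarm[i][d] = 1 - swarm[i][d]  # random bit flip
                light[i] = fitness(swarm[i])
                if light[i] > best_score:
                    best_mask, best_score = list(swarm[i]), light[i]
    return best_mask, best_score


# Hypothetical stand-in for a classifier-based score: reward three "relevant"
# features and lightly penalize every other selected feature.
RELEVANT = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for d, bit in enumerate(mask) if bit and d in RELEVANT)
    extras = sum(1 for d, bit in enumerate(mask) if bit and d not in RELEVANT)
    return hits - 0.1 * extras


mask, score = binary_firefly_select(toy_fitness, n_features=10)
print(mask, score)
```

Passing the fitness as a callable keeps the search independent of the classifier, so the toy score can be swapped for a validation metric such as F-measure without changing the search loop.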