Improving Arabic Text Categorization using Normalization and Stemming Techniques

M. Rouhia; Mohamed Hamdy; Mahmoud F. Hussein


Open Access

Author(s) - M. Rouhia, Mohamed Hamdy, Mahmoud F. Hussein

Publication year - 2016

Publication title - International Journal of Computer Applications

Language(s) - English

Resource type - Journals

ISSN - 0975-8887

DOI - 10.5120/ijca2016908328

Subject(s) - computer science, normalization (sociology), arabic, categorization, natural language processing, text categorization, artificial intelligence, information retrieval, linguistics, philosophy, sociology, anthropology

Categorization is a technique for assigning documents to one or more predefined categories based on their contents. Achieving high categorization accuracy remains one of the major challenges, and the process is also time-consuming. We propose an approach to tackle these challenges. The proposed approach uses the Frequency Ratio Accumulation Method (FRAM) as a classifier. Its features are represented using the bag-of-words (BOW) technique, and an improved Term Frequency (TF) technique is used for feature selection. The approach is tested on known datasets. The experiments are run without normalization or stemming, with only one of them, and with both. The results obtained with the proposed approach generally improve on those of existing techniques. Four performance measures of the proposed Arabic text categorization approach were considered: accuracy, recall, precision, and F-measure (F1). Using normalization, the averages of the obtained results are 97.50%, 97.50%, 97.51%, and 97.49%, respectively.

Keywords: text categorization, Frequency Ratio Accumulation Method (FRAM), bag-of-words (BOW), feature selection, term and document frequency.
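To make the pipeline named in the abstract more concrete, the following is a minimal Python sketch of Arabic normalization, bag-of-words tokenization, and FRAM-style scoring. It is an illustration under stated assumptions, not the authors' implementation. The normalization rules (stripping diacritics and tatweel, unifying alef variants, mapping ى to ي and ة to ه) are common choices in Arabic NLP and are assumed here. The frequency-ratio formula is one common formulation of FRAM. The function names and the two toy documents are hypothetical. Stemming and the paper's improved TF feature selection are not reproduced.

```python
import re
from collections import Counter, defaultdict

# Assumed normalization rules (common in Arabic NLP; not taken from the paper).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")    # short vowels, shadda, sukun, tatweel
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")  # alef with madda / hamza above / hamza below


def normalize(text):
    """Apply the assumed orthographic normalization to Arabic text."""
    text = _DIACRITICS.sub("", text)
    text = _ALEF_VARIANTS.sub("\u0627", text)  # unify to bare alef
    text = text.replace("\u0649", "\u064A")    # alef maqsura -> ya
    text = text.replace("\u0629", "\u0647")    # ta marbuta -> ha
    return text


def tokenize(text):
    """Bag-of-words tokens: normalized text split on whitespace."""
    return normalize(text).split()


def train_fram(labelled_docs):
    """Compute a frequency ratio FR(term, category) from (text, category) pairs.

    Assumed formulation: R(t, c) = TF(t, c) / total tokens in c,
    then FR(t, c) = R(t, c) / sum of R(t, c') over all categories c'.
    """
    tf = defaultdict(Counter)  # category -> term -> count
    for text, category in labelled_docs:
        tf[category].update(tokenize(text))
    totals = {c: sum(counts.values()) for c, counts in tf.items()}
    vocab = set().union(*tf.values())
    ratios = {}
    for term in vocab:
        rel = {c: tf[c][term] / totals[c] for c in tf}
        norm = sum(rel.values())
        ratios[term] = {c: r / norm for c, r in rel.items()}
    return ratios, sorted(tf)


def classify(text, ratios, categories):
    """Accumulate frequency ratios per category and return the best category."""
    scores = dict.fromkeys(categories, 0.0)
    for term in tokenize(text):
        for c, fr in ratios.get(term, {}).items():
            scores[c] += fr
    return max(scores, key=scores.get), scores


# Toy, made-up training documents (hypothetical; not from the paper's datasets).
docs = [
    ("فاز الفريق في مباراة الكأس", "sport"),
    ("ارتفع سعر النفط في السوق", "economy"),
]
ratios, categories = train_fram(docs)

# A query whose first word carries a diacritic; normalization removes it
# so that it matches the undiacritized training token.
word = "مباراة"
query = word[0] + "\u064F" + word[1:] + " الفريق"
label, scores = classify(query, ratios, categories)
print(label, scores)
```

In this sketch, a term that occurs in only one category contributes its full weight to that category, while a term shared evenly across categories (such as a preposition) is split between them, which is the intuition behind accumulating ratios rather than raw counts.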
