z-logo
open-access-imgOpen Access
The Impact of Transformed Features in Automating the Swahili Document Classification
Author(s) -
Thomas Tesha
Publication year - 2015
Publication title -
international journal of computer applications
Language(s) - English
Resource type - Journals
ISSN - 0975-8887
DOI - 10.5120/ijca2015906707
Subject(s) - swahili , computer science , information retrieval , natural language processing , data science , linguistics , philosophy
paper describes experimental results in an attempt to identify the Transformation techniques which can be adopted to improve features for the automation of classification of Swahili documents. This means improving classification rate by enhancing separability and accuracy. The experiment involved Relative Frequency (RF), Power transformation (PT) and Relative Frequency with Power transformation (RFPT). The Term weighting with TFIDF and the absolute features (AF) were also studied. The features' dimension reduction was done by using the statistical techniques of Principal Component Analysis. In learning algorithm, the Support vector machine for classification and the k-NN were used, and in evaluating the effect of features' performance with the classifiers the micro averaged f-measure were adopted. The extensive experimental results demonstrated that the RFPT features worked better with the Support Vector Machine classifiers unlike k-NN in improving the classification rate by enhancing document separability and accuracy in Automation of Swahili document classification.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom