Integrating Swarm Intelligence and Statistical Data for Feature Selection in Text Categorization
Author(s) - M. Janaki Meena, K. R. Chandran, J. Mary Brinda
Publication year - 2010
Publication title - International Journal of Computer Applications
Language(s) - English
Resource type - Journals
ISSN - 0975-8887
DOI - 10.5120/248-405
Subject(s) - computer science, feature selection, categorization, text categorization, selection (genetic algorithm), feature (linguistics), artificial intelligence, natural language processing, machine learning, information retrieval, data mining, data science, linguistics, philosophy
Feature selection is the principal step in classification problems with high-dimensional attributes. It can also be viewed as the problem of determining the subset of terms in the training corpus that maximizes the classifier’s performance. Most machine learning algorithms perform poorly in a high-dimensional feature space. In this paper, a novel feature selection method based on Ant Colony Optimization, a swarm intelligence algorithm, is proposed. Ant Colony Optimization is a metaheuristic algorithm used to improve the ability to find high-quality solutions to NP-hard problems. The heuristic information required for the optimization process is obtained through CHIR, a chi-square based statistical method, which results in fast convergence. The performance of the classifier with features selected by the proposed method is compared with its performance with features selected by the conventional chi-square and CHIR methods. The proposed algorithm is found to identify a better feature set than the conventional methods.
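
The idea in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes the plain chi-square statistic as a stand-in for CHIR, whose exact definition is not given here, and a generic pheromone update rule. The function names (chi_square, aco_select), the parameters (n_ants, n_iter, alpha, beta, rho), and the evaluate callback, which stands for any classifier-based score in [0, 1], are all hypothetical.

```python
import random


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/category contingency table.

    n11: documents in the category that contain the term
    n10: documents outside the category that contain the term
    n01: documents in the category that lack the term
    n00: documents outside the category that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def aco_select(heuristic, evaluate, k, n_ants=10, n_iter=20,
               alpha=1.0, beta=1.0, rho=0.1, seed=0):
    """Pick k feature indices with a simple ant colony search (assumed scheme).

    heuristic: one non-negative score per feature (e.g. chi-square values)
    evaluate:  callable mapping a list of indices to a score in [0, 1]
    """
    rng = random.Random(seed)
    n = len(heuristic)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of features")
    pheromone = [1.0] * n
    best_subset, best_score = None, float("-inf")
    for _ in range(n_iter):
        for _ in range(n_ants):
            candidates = list(range(n))
            subset = []
            for _ in range(k):
                # Selection weight combines pheromone and heuristic information.
                weights = [pheromone[i] ** alpha * (heuristic[i] + 1e-9) ** beta
                           for i in candidates]
                choice = rng.choices(candidates, weights=weights, k=1)[0]
                candidates.remove(choice)
                subset.append(choice)
            score = evaluate(subset)
            if score > best_score:
                best_subset, best_score = sorted(subset), score
        # Evaporate all trails, then reinforce the best subset found so far.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        for i in best_subset:
            pheromone[i] += max(best_score, 0.0)
    return best_subset, best_score


if __name__ == "__main__":
    # Toy heuristic values: features 0 and 2 are informative, 1 and 3 are not.
    scores = [5.0, 0.1, 4.0, 0.1]
    toy_eval = lambda s: len(set(s) & {0, 2}) / 2  # stand-in for classifier accuracy
    print(aco_select(scores, toy_eval, k=2))
```

In practice the evaluate callback would train and validate a classifier on the candidate term subset, which is the costly step; a strong heuristic steers ants toward promising terms early, which is consistent with the fast convergence the abstract attributes to CHIR.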