Improving Text Categorization by Multicriteria Feature Selection
Author(s) -
Son Doan,
Susumu Horiguchi
Publication year - 2005
Publication title -
Journal of Advanced Computational Intelligence and Intelligent Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.172
H-Index - 20
eISSN - 1343-0130
pISSN - 1883-8014
DOI - 10.20965/jaciii.2005.p0570
Subject(s) - computer science, feature selection, categorization, text categorization, artificial intelligence, selection (genetic algorithm), ranking (information retrieval), feature (linguistics), benchmark (surveying), machine learning, naive bayes classifier, data mining, natural language processing, information retrieval, pattern recognition (psychology), support vector machine, philosophy, linguistics, geodesy, geography
Text categorization involves assigning a natural language document to one or more predefined classes. One of the most interesting issues in this task is feature selection. We propose an approach that uses multicriteria ranking of features, a new procedure for feature selection, and apply it to text categorization. Experiments on the Reuters-21578 and 20Newsgroups benchmark data sets with the naive Bayes algorithm show that our proposal outperforms conventional feature selection in text categorization performance.
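
The abstract does not spell out the ranking procedure, so the following is only an illustrative sketch of what ranking features under more than one criterion can look like, not the authors' method. It assumes two criteria (document frequency and a chi-square statistic against one class) and combines them by summing rank positions; the function names, the choice of criteria, and the rank-sum combination are all assumptions made for this example.

```python
from collections import Counter


def document_frequency(docs):
    """Criterion 1 (assumed): number of documents each term occurs in."""
    df = Counter()
    for tokens, _label in docs:
        df.update(set(tokens))
    return dict(df)


def chi_square(docs, positive_label):
    """Criterion 2 (assumed): chi-square of each term against one class."""
    n = len(docs)
    n_pos = sum(1 for _tokens, label in docs if label == positive_label)
    df, df_pos = Counter(), Counter()
    for tokens, label in docs:
        terms = set(tokens)
        df.update(terms)
        if label == positive_label:
            df_pos.update(terms)
    scores = {}
    for term, total in df.items():
        a = df_pos[term]        # term present, class positive
        b = total - a           # term present, class negative
        c = n_pos - a           # term absent, class positive
        d = n - n_pos - b       # term absent, class negative
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def ranks(scores):
    """Map term -> rank position (0 = best) under a single criterion."""
    ordered = sorted(scores, key=lambda t: (-scores[t], t))
    return {term: i for i, term in enumerate(ordered)}


def select_features(docs, positive_label, k):
    """Keep the k terms with the smallest summed rank over all criteria.

    The rank-sum combination is an assumption of this sketch, not the
    procedure proposed in the paper.
    """
    criteria = [document_frequency(docs), chi_square(docs, positive_label)]
    rank_tables = [ranks(scores) for scores in criteria]
    combined = {t: sum(r[t] for r in rank_tables) for t in rank_tables[0]}
    return sorted(combined, key=lambda t: (combined[t], t))[:k]


if __name__ == "__main__":
    toy_docs = [
        (["wheat", "price", "rise"], "grain"),
        (["wheat", "export", "price"], "grain"),
        (["oil", "price", "fall"], "other"),
        (["oil", "export", "rise"], "other"),
    ]
    print(select_features(toy_docs, "grain", k=2))
```

The selected terms would then form the vocabulary passed to a classifier such as naive Bayes. Ties in a score or in the summed rank are broken alphabetically here so that the output is deterministic.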