Text Categorization based on Clustering Feature Selection

Xiaofei Zhou, Yue Hu, Li Guo

Open Access

Author(s) - Xiaofei Zhou, Yue Hu, Li Guo

Publication year - 2014

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2014.05.283

Subject(s) - computer science, categorization, feature selection, cluster analysis, text categorization, selection (genetic algorithm), artificial intelligence, feature (linguistics), pattern recognition (psychology), data mining, information retrieval, philosophy, linguistics

In this paper, we discuss a text categorization method based on k-means clustering feature selection. K-means is a classical algorithm for data clustering in text mining, but it is seldom used for feature selection. For text data, the words that correctly express the semantics of a class are usually good features. We use k-means to compute several cluster centroids for each class and then choose the high-frequency words in those centroids as the text features for categorization. The words extracted by k-means not only represent the clusters of each class well but also have high quality for semantic expression. On three standard text databases, classifiers based on our feature selection method perform better than the original classifiers for text categorization.
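The selection step described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the term weighting, the number of clusters per class, the initialization, or how many words are taken from each centroid. Here the documents are raw term-frequency vectors, k-means is plain Lloyd's iteration started from the first k documents, and the names `tf_vectors`, `kmeans`, `select_features`, `k`, and `top_n` are all hypothetical.

```python
from collections import Counter


def tf_vectors(docs, vocab):
    """Turn tokenized documents into term-frequency vectors over vocab."""
    index = {word: i for i, word in enumerate(vocab)}
    vectors = []
    for tokens in docs:
        vec = [0.0] * len(vocab)
        for word, count in Counter(tokens).items():
            vec[index[word]] = float(count)
        vectors.append(vec)
    return vectors


def kmeans(vectors, k, iters=20):
    """Plain Lloyd's k-means. Assumption: centroids start at the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for v in vectors:
            # Assign each document to the nearest centroid (squared Euclidean).
            dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(v)
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def select_features(docs_by_class, k=2, top_n=2):
    """Cluster each class separately and keep the top-weighted words per centroid."""
    vocab = sorted({w for docs in docs_by_class.values() for d in docs for w in d})
    selected = set()
    for docs in docs_by_class.values():
        centroids = kmeans(tf_vectors(docs, vocab), min(k, len(docs)))
        for c in centroids:
            # Rank words by centroid weight; ties are broken alphabetically.
            ranked = sorted(range(len(vocab)), key=lambda i: (-c[i], vocab[i]))
            selected.update(vocab[i] for i in ranked[:top_n] if c[i] > 0)
    return sorted(selected)


# Toy corpus (invented for illustration): two classes, two sub-topics each.
docs_by_class = {
    "sports": [
        "goal goal match team".split(),
        "match team goal".split(),
        "race race lap driver".split(),
        "lap race driver".split(),
    ],
    "finance": [
        "stock stock market price".split(),
        "market stock price".split(),
        "loan loan bank rate".split(),
        "bank loan rate".split(),
    ],
}
features = select_features(docs_by_class, k=2, top_n=2)
print(features)
```

The selected words would then define the reduced vocabulary on which a classifier is trained. Clustering each class separately, rather than the whole corpus at once, is what lets several centroids capture distinct sub-topics inside one class.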
