Open Access
Feature Selection using Normalized Weight Method for Tamil Text Classification
Author(s) -
N. Rajkumar,
T. S. Subashini,
K. Rajan,
V. Ramalingam
Publication year - 2020
Publication title -
International Journal of Recent Technology and Engineering
Language(s) - English
Resource type - Journals
ISSN - 2277-3878
DOI - 10.35940/ijrte.f9068.059120
Subject(s) - tamil , computer science , automatic summarization , artificial intelligence , feature selection , document classification , natural language processing , information retrieval , categorization , selection (genetic algorithm) , tf–idf , weighting , term (time) , linguistics , medicine , philosophy , physics , quantum mechanics , radiology
Feature selection simplifies Tamil text classification. In the present information age, applications on the World Wide Web have grown rapidly, and regional-language material in Tamil, such as web pages, e-mails, e-books, and other digital data, has grown enormously with them, so retrieval of Tamil digital documents is in high demand among people searching for Tamil documents. To retrieve the needed Tamil documents quickly from among millions of Tamil web documents, the documents should be classified by content into their classes. Tamil text classification is groundwork for many Tamil NLP applications, such as query response, information extraction, and information summarization, and text categorization is very important in the field of information retrieval. Text categorization assigns a document an appropriate category from a predefined group of categories; Tamil text classification does so based on the Tamil text in the document. Tamil is morphologically rich, so the language has a very large set of word forms, and it is therefore important to reduce the number of features in Tamil text. This paper discusses feature selection using normalized weights over the large set of keywords obtained from the preprocessed corpus. Feature selection by the normalized term weighting (TF*IDF) method reduces the size of the keyword list, which is very useful for training and testing Tamil text classification algorithms.
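The abstract does not give the exact weighting or normalization formula, so the following is only a minimal sketch of the general idea, under stated assumptions: raw term frequency times log(N/df) as the TF*IDF weight, L2 (cosine) normalization per document, and selection of the k terms with the highest normalized weight in any document. The function names normalized_tfidf and select_features, the parameter k, and the toy tokens are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def normalized_tfidf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists (already preprocessed).
    Assumption: weight = tf * log(N / df), then L2-normalized per document.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)
        raw = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        # Guard against documents whose terms all have zero weight.
        weighted.append({t: (w / norm if norm > 0 else 0.0) for t, w in raw.items()})
    return weighted


def select_features(docs, k):
    """Keep the k terms with the highest normalized weight in any document.

    Assumption: a simple top-k cut; the paper may use a different criterion.
    """
    best = {}
    for weights in normalized_tfidf(docs):
        for term, w in weights.items():
            best[term] = max(best.get(term, 0.0), w)
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy corpus with placeholder tokens standing in for preprocessed Tamil keywords.
docs = [["a", "b", "b"], ["a", "c"], ["a", "d", "d", "d"]]
selected = select_features(docs, 3)
```

In this toy corpus the token "a" occurs in every document, so its IDF is zero and it is dropped from the selected list, which is the kind of keyword-list reduction the abstract describes.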