Open Access
Text Categorization Techniques and Current Trends
Author(s) -
Amit Jain,
Aditya Goyal,
Vikrant Singh,
Anshul Tripathi,
Kandasamy Saravanakumar
Publication year - 2020
Publication title -
International Journal of Engineering and Advanced Technology
Language(s) - English
Resource type - Journals
ISSN - 2249-8958
DOI - 10.35940/ijeat.e9620.069520
Subject(s) - categorization , computer science , information retrieval , natural language processing , feature selection , artificial intelligence , meaning (existential) , sorting , selection (genetic algorithm) , feature (linguistics) , text mining , task (project management) , reading (process) , linguistics , psychology , philosophy , management , economics , psychotherapist , programming language
With the growth of online data, text categorization has become one of the key techniques for handling and organizing textual information. Text categorization techniques are used to classify documents and to find interesting information on the World Wide Web. Text categorization is the task of assigning information to categories based on its text, and it is important for the effective analysis of textual data. Systems designed to analyse text and distinguish between meaningful classes of information are known as text classification systems. Such systems are widely accepted and have been used for information retrieval and natural language processing. Documents can be classified in three ways: with unsupervised, supervised, or semi-supervised techniques. Text categorization refers to the process of automatically assigning one or more categories, from a predefined set, to each document. For given text data, words that carry the correct meaning consistently across different documents are usually considered good features. In this paper, we use certain measures to ensure meaningful text categorization. One such measure is feature selection, the solution proposed in this paper, which leaves the original features themselves unchanged. We take into account all meaningful features to distinguish between different text categorization approaches and highlight the evaluation metrics, advantages, and limitations of each approach. We studied the working of several approaches and identified the best-suited algorithm through practical evaluation. We review different papers according to the different areas of text categorization and present a comparative and conclusive analysis. This paper presents a classification of the various approaches to text categorization and compares them.
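To illustrate what feature selection means in a supervised setting, the following is a minimal sketch that ranks terms by a chi-square statistic and keeps the top k. The abstract does not say which selection criterion the paper uses, so the chi-square score, the function names `chi_square_scores` and `select_features`, and the toy corpus are all assumptions made for illustration. The point it shows is that selection only picks a subset of the original terms and does not transform them.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Chi-square score of each term against one target class.

    docs   : list of token lists (hypothetical, pre-tokenized input)
    labels : list of class labels, one per document
    target : the class the terms are scored against
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos

    # Document frequency of each term inside and outside the target class.
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (df_pos if y == target else df_neg).update(set(tokens))

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # target-class documents containing the term
        b = df_neg[term]   # other documents containing the term
        c = n_pos - a      # target-class documents without the term
        d = n_neg - b      # other documents without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A term present in every document (or none) carries no signal.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_features(docs, labels, target, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    scores = chi_square_scores(docs, labels, target)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy labelled corpus (invented for this sketch, not taken from the paper).
docs = [
    "the match ended with a late goal".split(),
    "the team scored a goal in the match".split(),
    "the market fell as the stock dropped".split(),
    "the stock market closed higher".split(),
]
labels = ["sport", "sport", "finance", "finance"]

top_terms = select_features(docs, labels, target="sport", k=4)
print(top_terms)
```

Because chi-square measures dependence in either direction, terms that are strongly associated with the other class also rank highly, while a term that appears in every document, such as "the", scores zero. The selected terms are taken verbatim from the vocabulary, which is what distinguishes feature selection from approaches that build new, transformed features.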