
Assamese Text Classification using k Nearest Neighbor
Author(s) -
Moromi Gogoi*,
Shikhar Kumar Sarma
Publication year - 2019
Publication title -
International Journal of Recent Technology and Engineering
Language(s) - English
Resource type - Journals
ISSN - 2277-3878
DOI - 10.35940/ijrte.d8820.118419
Subject(s) - newspaper, Assamese, computer science, information retrieval, k nearest neighbors algorithm, world wide web, artificial intelligence, advertising, linguistics, philosophy, business
Knowledge is the most powerful weapon of a society, and in today's world it is just a mouse click away. Knowledge and information are abundant in the form of newspapers, electronic newspapers, articles, online journals, webpages, search results, etc., and news is available from all over the world. The choice of news, however, varies from person to person. Some people prefer sports news to entertainment news, some prefer political news to sports news, and there can be any number of other preferences. The choice depends entirely on the individual. Document classification is the process of assigning a document to one of a number of predefined classes. In this paper we perform document classification of Assamese text using k-Nearest Neighbor. We consider only four classes: sports, politics, law and science. Our dataset consists of 200 documents collected from major Assamese newspapers. We split the data in a 3:1 ratio: 75% of the documents are used for training and the remaining 25% for testing.
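The pipeline summarized in the abstract (a 3:1 train/test split followed by k-Nearest Neighbor classification into four classes) can be illustrated with a minimal sketch. The abstract does not state the feature representation, the distance measure, or the value of k, so everything below is an assumption made for illustration only: whitespace tokenization, raw term counts, cosine similarity, k = 3, and a tiny English placeholder corpus standing in for the Assamese documents. The function names are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts from whitespace tokens (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_3_to_1(docs, seed=0):
    """Shuffle labeled documents and split them 75% train / 25% test."""
    shuffled = docs[:]
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) * 3) // 4
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, text, k=3):
    """Label `text` by majority vote among its k most similar training documents."""
    query = vectorize(text)
    ranked = sorted(train, key=lambda d: cosine(query, vectorize(d[0])), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Placeholder corpus (hypothetical English text, NOT the paper's dataset).
toy_docs = [
    ("team wins football match", "sports"),
    ("player scored goal in final match", "sports"),
    ("cricket team loses match", "sports"),
    ("minister announces election date", "politics"),
    ("party wins assembly election", "politics"),
    ("parliament debates new budget", "politics"),
    ("court grants bail to accused", "law"),
    ("judge delivers verdict on appeal", "law"),
    ("high court hears petition", "law"),
    ("scientists discover new planet", "science"),
    ("researchers test vaccine", "science"),
    ("satellite launched into orbit", "science"),
]

# Classify an unseen sentence against the whole placeholder corpus.
predicted = knn_predict(toy_docs, "team scored a goal in the match", k=3)

# With 200 labeled documents, the same split yields 150 training and 50 test items.
train_docs, test_docs = split_3_to_1([(f"doc {i}", "sports") for i in range(200)])
```

Cosine similarity over term counts is only one common choice for text; a TF-IDF weighting or a different distance measure would slot into `cosine` and `vectorize` without changing the voting step.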