Open Access
Kannada morpheme segmentation using machine learning
Author(s) - Sachi Angle, Ashwath B Rao, S. Muralikrishna
Publication year - 2018
Publication title - International Journal of Engineering and Technology
Language(s) - English
Resource type - Journals
ISSN - 2227-524X
DOI - 10.14419/ijet.v7i2.31.13395
Subject(s) - agglutinative language, morpheme, computer science, artificial intelligence, natural language processing, root (linguistics), treebank, text segmentation, context (archaeology), word (group theory), segmentation, annotation, linguistics, paleontology, philosophy, biology
This paper addresses morpheme segmentation of Kannada words using supervised classification. We use a manually annotated Kannada treebank corpus that we recently developed. Kannada resembles other Dravidian languages in its morphological structure. It is an agglutinative language, so its words have complex morphological forms, each comprising a root and an optional set of suffixes. These suffixes carry meaning in context beyond that of the root word. This paper discusses extracting the morphemes of a word using Support Vector Machines for classification. Additional features representing properties of Kannada words were extracted, and the individual letters were classified into labels that yield the morphological segmentation of the word. Various evaluation methods were considered, and an accuracy of 85.97% was achieved.
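
The idea of classifying each letter into a label that implies a segmentation can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the label set or the features, so the two-label scheme (`B` opens a morpheme, `I` continues it), the context-window features, the function names, and the romanized example word with its segmentation are all hypothetical. The classifier itself is omitted; the sketch covers only the steps around it.

```python
from typing import Dict, List


def labels_from_segments(segments: List[str]) -> List[str]:
    """Convert a gold segmentation into one label per letter.

    Assumed scheme (not from the paper): 'B' marks the first letter of a
    morpheme and 'I' marks every following letter of the same morpheme.
    """
    labels: List[str] = []
    for seg in segments:
        if not seg:  # ignore empty segments
            continue
        labels.extend(["B"] + ["I"] * (len(seg) - 1))
    return labels


def letter_features(word: str, i: int, window: int = 2) -> Dict[str, object]:
    """Build hypothetical features for the letter at index i.

    Uses the position of the letter and the letters in a small window
    around it; '#' pads positions that fall outside the word.
    """
    feats: Dict[str, object] = {"pos": i, "from_end": len(word) - 1 - i}
    for off in range(-window, window + 1):
        j = i + off
        feats[f"ch[{off:+d}]"] = word[j] if 0 <= j < len(word) else "#"
    return feats


def segments_from_labels(word: str, labels: List[str]) -> List[str]:
    """Decode per-letter labels back into a list of morphemes."""
    segments: List[str] = []
    for ch, lab in zip(word, labels):
        if lab == "B" or not segments:
            segments.append(ch)  # start a new morpheme
        else:
            segments[-1] += ch   # extend the current morpheme
    return segments


# Illustrative romanized word and segmentation (an assumption, not corpus data).
word = "manegalalli"
gold = ["mane", "gal", "alli"]
labels = labels_from_segments(gold)
features = [letter_features(word, i) for i in range(len(word))]
decoded = segments_from_labels(word, labels)
```

In a full pipeline, the feature dictionaries would be vectorized and passed with the labels to an SVM classifier (for example, scikit-learn's `DictVectorizer` and `SVC`), and the predicted labels for an unseen word would be decoded with `segments_from_labels`.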
