Open Access
Applying Support Vector Machines to POS tagging of the Ainu Language
Author(s) - Karol Nowakowski, Michał Ptaszyński, Fumito Masui, Yoshio Momouchi
Publication year - 2019
Language(s) - English
DOI - 10.33011/computel.v2i.449
Subject(s) - computer science, task (project management), natural language processing, artificial intelligence, domain (mathematical analysis), annotation, spoken language, part of speech tagging, part of speech, support vector machine, state (computer science), speech recognition, programming language, mathematical analysis, mathematics, management, economics
We describe our attempt to apply SVMTool, a state-of-the-art sequential tagger, to the task of automatic part-of-speech (POS) annotation of Ainu, a critically endangered language isolate spoken by the native inhabitants of northern Japan. Our experiments indicated that it performs better than POST-AL, the custom system proposed in previous research, especially when applied to out-of-domain data. The biggest advantage of the model trained with SVMTool over the POST-AL tagger is its ability to guess part-of-speech tags for out-of-vocabulary (OoV) words, with an accuracy of up to 63%.
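To make the out-of-vocabulary figure concrete, the sketch below shows one common way such a score can be computed: a test token counts as OoV if its word form never appears in the training data, and OoV accuracy is the share of those tokens that receive the correct tag. This is an illustrative assumption, not the evaluation script used by the authors. The function name `oov_accuracy`, the placeholder tokens (`w1`, `w2`, ...) and the tags are hypothetical.

```python
from typing import Sequence, Set, Tuple

TaggedToken = Tuple[str, str]  # (word form, gold POS tag)


def oov_accuracy(
    train_vocab: Set[str],
    gold: Sequence[TaggedToken],
    predicted: Sequence[str],
) -> Tuple[float, float]:
    """Return (overall accuracy, accuracy restricted to OoV tokens)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    total_correct = 0
    oov_total = 0
    oov_correct = 0
    for (word, gold_tag), pred_tag in zip(gold, predicted):
        correct = gold_tag == pred_tag
        total_correct += correct
        # Assumption: a token is OoV if its form is absent from the training vocabulary.
        if word not in train_vocab:
            oov_total += 1
            oov_correct += correct

    overall = total_correct / len(gold) if gold else 0.0
    oov = oov_correct / oov_total if oov_total else 0.0
    return overall, oov


# Hypothetical toy data: placeholder tokens, not real Ainu words or results.
train_vocab = {"w1", "w2", "w3"}
gold = [("w1", "N"), ("w2", "V"), ("w4", "N"), ("w5", "V")]
predicted = ["N", "V", "N", "N"]

overall, oov = oov_accuracy(train_vocab, gold, predicted)
print(f"overall={overall:.2f} oov={oov:.2f}")
```

In this toy run, two of the four test tokens (`w4`, `w5`) are unseen in training, and one of them is tagged correctly. Reporting the OoV score separately from the overall score is what makes it possible to compare how well different taggers generalize to unseen words.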