Open Access
Bootstrapping a Neural Morphological Analyzer for St. Lawrence Island Yupik from a Finite-State Transducer
Author(s) - Lane Schwartz, Emily Chen, Benjamin Hunt, Sylvia L.R. Schreiner
Publication year - 2019
Language(s) - English
DOI - 10.33011/computel.v1i.4277
Subject(s) - spectrum analyzer , computer science , bootstrapping (finance) , state (computer science) , set (abstract data type) , artificial neural network , artificial intelligence , algorithm , mathematics , telecommunications , programming language , econometrics
Morphological analysis is a critical enabling technology for polysynthetic languages. We present a neural morphological analyzer for case-inflected nouns in St. Lawrence Island Yupik, an endangered polysynthetic language in the Inuit-Yupik language family, treating morphological analysis as a recurrent neural sequence-to-sequence task. By using an existing finite-state morphological analyzer to create training data, we improve analysis coverage on attested Yupik word types from approximately 75% for the existing finite-state analyzer to 100% for the neural analyzer. We also achieve substantially higher accuracy on a held-out test set: 92.2% for our neural analyzer, compared with 78.9% for the finite-state analyzer.
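
The bootstrapping idea in the abstract can be illustrated with a minimal sketch. It assumes the finite-state analyzer can be run as a generator that maps analyses to surface forms, and that each (surface form, analysis) pair is then split into a character-level source sequence and a target sequence of root characters plus whole tags. The lexicon, the bracketed tag notation, and every function name below are hypothetical; the entries are English placeholders rather than Yupik data, and the sketch does not reproduce the authors' actual pipeline or model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Toy stand-in for a finite-state generator: analysis -> surface form.
# These English entries are hypothetical placeholders, not Yupik data.
TOY_LEXICON: Dict[str, str] = {
    "cat[N][Sg]": "cat",
    "cat[N][Pl]": "cats",
    "dog[N][Pl]": "dogs",
}


def tokenize_analysis(analysis: str) -> List[str]:
    """Split an analysis into root characters and whole bracketed tags."""
    tokens: List[str] = []
    i = 0
    while i < len(analysis):
        if analysis[i] == "[":
            j = analysis.index("]", i)  # find the end of this tag
            tokens.append(analysis[i:j + 1])
            i = j + 1
        else:
            tokens.append(analysis[i])
            i += 1
    return tokens


def make_training_pairs(
    lexicon: Dict[str, str],
) -> List[Tuple[List[str], List[str]]]:
    """Build (source, target) pairs: surface characters -> analysis tokens."""
    return [
        (list(surface), tokenize_analysis(analysis))
        for analysis, surface in lexicon.items()
    ]


def coverage(
    analyze: Callable[[str], Optional[str]], word_types: List[str]
) -> float:
    """Fraction of word types for which the analyzer returns any analysis."""
    if not word_types:
        return 0.0
    analyzed = sum(1 for word in word_types if analyze(word) is not None)
    return analyzed / len(word_types)


def toy_analyze(word: str) -> Optional[str]:
    """Lookup-based analyzer: returns None for words outside the lexicon."""
    inverse = {surface: analysis for analysis, surface in TOY_LEXICON.items()}
    return inverse.get(word)


pairs = make_training_pairs(TOY_LEXICON)
toy_coverage = coverage(toy_analyze, ["cat", "cats", "dogs", "bird"])
```

A lookup-style analyzer returns nothing for a word outside its lexicon, which is what keeps coverage below 100%. A sequence-to-sequence model trained on such pairs always emits some output sequence, which is consistent with the 100% coverage reported above, although an output is not guaranteed to be correct and accuracy has to be measured separately on held-out data.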