Translating Fieldwork into Datasets: The Development of a Corpus for the Quantitative Investigation of Grammatical Phenomena in Eibela
Author(s) - Grant Aiton
Publication year - 2021
Language(s) - English
DOI - 10.33011/computel.v2i.973
Subject(s) - computer science, annotation, natural language processing, realization (probability), python (programming language), artificial intelligence, linguistics, xml, argument (complex analysis), process (computing), programming language, world wide web, philosophy, statistics, biochemistry, chemistry, mathematics
This extended abstract details the process of constructing an annotated XML corpus suitable for quantitative analysis of morphosyntactic and phonetic phenomena in Eibela, a language of Papua New Guinea. It also presents preliminary results on the semantic, phonetic, and discourse correlates of argument realization. The goal of the paper is to illustrate how legacy materials can be enriched and investigated using computational methods, including bulk processing of data in Python and R, forced alignment of phonetic segments with the Montreal Forced Aligner (MFA), and the morphosyntactic annotation developed as part of the Multilingual Corpus of Annotated Spoken Texts (Multi-CAST).
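
As a rough illustration of the kind of query an annotated XML corpus makes possible, the following Python sketch counts how often each argument role is realized in each form. The element names (`corpus`, `utterance`, `arg`), the attributes (`role`, `form`), and the sample data are all hypothetical; the abstract does not describe the actual schema of the Eibela corpus, so this should be read as a sketch under those assumptions and not as the author's implementation.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data: the schema and labels are assumptions made
# for illustration, not the actual Eibela corpus format.
SAMPLE = """
<corpus>
  <utterance id="u1">
    <arg role="A" form="np"/>
    <arg role="P" form="zero"/>
  </utterance>
  <utterance id="u2">
    <arg role="S" form="pro"/>
  </utterance>
  <utterance id="u3">
    <arg role="A" form="zero"/>
    <arg role="P" form="np"/>
  </utterance>
</corpus>
"""


def count_realizations(xml_text):
    """Count (role, form) pairs over every <arg> element in the corpus."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for arg in root.iter("arg"):
        counts[(arg.get("role"), arg.get("form"))] += 1
    return counts


counts = count_realizations(SAMPLE)
for (role, form), n in sorted(counts.items()):
    print(f"{role}\t{form}\t{n}")
```

A table of counts like this could then be exported for statistical modelling in R or Python, which is the sort of bulk processing the abstract mentions.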
