
Parallel Treebanking Spanish-Quechua
Author(s) - Annette Rios, Anne Göhring, Martin Volk
Publication year - 2012
Publication title - Linguistic Issues in Language Technology
Language(s) - English
Resource type - Journals
eISSN - 1945-3590
pISSN - 1945-3604
DOI - 10.33011/lilt.v7i.1285
Subject(s) - agglutinative language, treebank, computer science, German, natural language processing, lexicon, artificial intelligence, machine translation, linguistics, text segmentation, segmentation, annotation, parsing, philosophy
Parallel treebanking is greatly facilitated by automatic word alignment. We are building a trilingual treebank for German, Spanish and Quechua. We ran different alignment experiments on parallel Spanish-Quechua texts, measured the alignment quality, and compared these results to the figures we obtained when aligning a comparable corpus of Spanish-German texts. This preliminary work has shown us which word segmentation of the agglutinative language Quechua works best for alignment. We also gained a first impression of how well Quechua can be aligned to Spanish, an important prerequisite for bilingual lexicon extraction, parallel treebanking or statistical machine translation.