Parallel Treebanking Spanish-Quechua
Author(s) - Annette Rios, Anne Göhring, Martin Volk
Publication year - 2012
Publication title - Linguistic Issues in Language Technology
Language(s) - English
Resource type - Journals
eISSN - 1945-3590
pISSN - 1945-3604
DOI - 10.33011/lilt.v7i.1285
Subject(s) - agglutinative language , treebank , computer science , German , natural language processing , lexicon , artificial intelligence , machine translation , linguistics , text segmentation , segmentation , annotation , parsing , philosophy
Parallel treebanking is greatly facilitated by automatic word alignment. We are building a trilingual treebank for German, Spanish and Quechua. We ran different alignment experiments on parallel Spanish-Quechua texts, measured the alignment quality, and compared these results to the figures we obtained by aligning a comparable corpus of Spanish-German texts. This preliminary work has shown us which word segmentation of the agglutinative language Quechua works best for alignment. We also gained a first impression of how well Quechua can be aligned to Spanish, an important prerequisite for bilingual lexicon extraction, parallel treebanking and statistical machine translation.