
A comparison of standard and factored models in statistical machine translation from English to Slovenian using the Moses toolkit
Author(s) -
Sašo Kuntarič,
Simon Krek,
Marko Robnik Šikonja
Publication year - 2018
Publication title -
slovenščina 2.0
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.165
H-Index - 1
ISSN - 2335-2736
DOI - 10.4312/slo2.0.2017.1.1-26
Subject(s) - computational linguistics
Machine translation is a field of computational linguistics that explores the use of software to translate text from one language to another. Factored statistical translation is an extension of statistical machine translation in which linguistic annotation is added at the word level: each word is represented as a vector of factors in an attempt to improve translation quality. We describe the use of the open-source Moses system for factored statistical machine translation from English to Slovenian. We built several factored and non-factored language and translation models from a corpus of IT-related texts and used them to translate two IT-related documents. The first was marketing-oriented with a complex structure, while the second was technical with a simpler structure. We used two methods to compare the generated translations with two independent human translations and with a translation produced by the Google Translate service. The first method was the BLEU metric and the second was evaluation by human reviewers. The latter yields a subjective score, which remains very important in machine translation. Even though the results cannot be compared directly because the metrics differ, the scores follow well-correlated trends for both texts. The only larger difference appears when factored models are used to translate the second text. In the conclusion we analysed inter-evaluator agreement and the obtained results. We found that our models are more suitable for technical texts, and that factored models improve the translation of complex texts more
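To make the idea of word-level factors concrete, the following is a minimal sketch of how a factored token can be represented and projected onto a single factor. It assumes the pipe-separated notation (`surface|lemma|tag`) commonly used for factored corpora in Moses. The names `FactoredToken`, `parse_factored` and `project`, the example sentence, and its tags are hypothetical illustrations and are not taken from the paper.

```python
from typing import List, NamedTuple


class FactoredToken(NamedTuple):
    """One word with its factors (hypothetical three-factor layout)."""
    surface: str  # the word form as it appears in the text
    lemma: str    # the dictionary form
    tag: str      # a part-of-speech or morphosyntactic tag


def parse_factored(line: str, sep: str = "|") -> List[FactoredToken]:
    """Split a whitespace-tokenized line of sep-joined factors into tokens."""
    tokens = []
    for item in line.split():
        parts = item.split(sep)
        if len(parts) != 3:
            raise ValueError(f"expected 3 factors, got {len(parts)}: {item!r}")
        tokens.append(FactoredToken(*parts))
    return tokens


def project(tokens: List[FactoredToken], factor: str) -> List[str]:
    """Keep only one factor per token, e.g. to train a lemma-level model."""
    return [getattr(token, factor) for token in tokens]


# Illustrative input; the tags are assumed, not drawn from the paper's corpus.
sentence = "files|file|NNS are|be|VBP stored|store|VBN locally|locally|RB"
tokens = parse_factored(sentence)
lemmas = project(tokens, "lemma")
tags = project(tokens, "tag")
```

A non-factored model would see only the `surface` column, whereas a factored model can also condition on the other columns, which is one way to reduce data sparsity for a morphologically rich target language such as Slovenian.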