

Open Access

The Operation Sequence Model—Combining N-Gram-Based and Phrase-Based Statistical Machine Translation

Author(s) - Nadir Durrani, Helmut Schmid, Alexander Fraser, Philipp Koehn, Hinrich Schütze

Publication year - 2015

Publication title - Computational Linguistics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.314

H-Index - 98

eISSN - 1530-9312

pISSN - 0891-2017

DOI - 10.1162/coli_a_00218

Subject(s) - computer science, machine translation, phrase, n-gram, natural language processing, example-based machine translation, artificial intelligence, evaluation of machine translation, rule-based machine translation, spurious relationship, transfer-based machine translation, speech recognition, language model, algorithm, machine translation software usability, machine learning

In this article, we present a novel machine translation model, the Operation Sequence Model (OSM), which combines the benefits of phrase-based and N-gram-based statistical machine translation (SMT) and remedies their drawbacks. The model represents the translation process as a linear sequence of operations that includes not only translation operations but also reordering operations. As in N-gram-based SMT, the model (i) is based on minimal translation units, (ii) takes both source and target information into account, (iii) does not make a phrasal independence assumption, and (iv) avoids the spurious phrasal segmentation problem. As in phrase-based SMT, the model (i) has the ability to memorize lexical reordering triggers, (ii) builds the search graph dynamically, and (iii) decodes with large translation units during search. The unique properties of the model are (i) its strong coupling of reordering and translation, in which each translation and reordering decision is conditioned on the n previous translation and reordering decisions, and (ii) its ability to model local and long-range reorderings consistently. Using BLEU as a metric of translation accuracy, we found that our system performs significantly better than state-of-the-art phrase-based systems (Moses and Phrasal) and N-gram-based systems (Ncode) on standard translation tasks. We compare the reordering component of the OSM to the Moses lexical reordering model by integrating it into Moses. Our results show that the OSM outperforms lexicalized reordering on all translation tasks. Translation quality improves further when generalized representations are learned with a POS-based OSM.
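The idea of scoring a translation as a linear sequence of operations, each conditioned on the n previous ones, can be illustrated with a small n-gram model over operation tokens. The following Python sketch is only an illustration under stated assumptions: the operation names (Generate, InsertGap, JumpBack), the German-English toy example, the function names train_ngram and log_prob, and the add-k smoothing are all hypothetical choices made here and are not taken from the abstract, which does not specify the operation inventory or the estimation method.

```python
import math
from collections import Counter

def train_ngram(sequences, n=3):
    """Count n-grams and their (n-1)-gram contexts over operation sequences."""
    ngrams, contexts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        padded = ["<s>"] * (n - 1) + list(seq) + ["</s>"]
        vocab.update(padded[n - 1:])  # operations plus the end marker
        for i in range(n - 1, len(padded)):
            ctx = tuple(padded[i - n + 1:i])
            ngrams[ctx + (padded[i],)] += 1
            contexts[ctx] += 1
    return ngrams, contexts, vocab

def log_prob(seq, model, n=3, k=1.0):
    """Chain-rule log-probability of one operation sequence.

    Each operation is conditioned on the n-1 preceding operations.
    Add-k smoothing is an assumption made for this sketch only.
    """
    ngrams, contexts, vocab = model
    padded = ["<s>"] * (n - 1) + list(seq) + ["</s>"]
    total = 0.0
    for i in range(n - 1, len(padded)):
        ctx = tuple(padded[i - n + 1:i])
        numerator = ngrams[ctx + (padded[i],)] + k
        denominator = contexts[ctx] + k * len(vocab)
        total += math.log(numerator / denominator)
    return total

# Hypothetical operation sequence for "Er hat ein Buch gelesen" ->
# "He has read a book": translation and reordering operations are
# interleaved in a single stream.
reordered = [
    "Generate(Er,He)",
    "Generate(hat,has)",
    "InsertGap",
    "Generate(gelesen,read)",
    "JumpBack(1)",
    "Generate(ein,a)",
    "Generate(Buch,book)",
]

model = train_ngram([reordered, reordered], n=3)
print(log_prob(reordered, model))
print(log_prob(list(reversed(reordered)), model))
```

Because reordering operations sit in the same stream as translation operations, a single n-gram context can contain both kinds of decision, which is one way to picture the coupling of reordering and translation that the abstract describes.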
