Differential Testing of Machine Translators based on Compositional Semantics
Author(s) - Shuang Liu, Shujie Dou, Junjie Chen, Zhirun Zhang, Ye Lu
Publication year - 2023
Publication title - IEEE Transactions on Software Engineering
Language(s) - English
Resource type - Journals
eISSN - 1939-3520
pISSN - 0098-5589
DOI - 10.1109/tse.2023.3323969
Subject(s) - computing and processing
Powered by advances in deep neural networks, machine translation software has progressed rapidly in recent years. Machine translators are widely used in people's daily lives, e.g., for information consumption, medical consumption and online shopping. However, they are far from robust and may produce wrong translations, which can cause misunderstandings or even serious consequences. It is thus critical to detect errors in machine translators and to provide informative feedback to developers. In this work, we adopt differential testing to test machine translators, using mature commercial translators as reference translation engines. Our test oracle is based on the principle of compositionality, which states that the meaning of a complex expression is determined by the meanings of its constituent expressions and the syntactic rules used to combine them; accordingly, the oracle compares similarity under the guidance of syntactic structure and semantic encoding. Specifically, we employ constituency parsing to obtain the part-whole structural relation between a sentence and one of its components, and then compute the semantic similarity of each sentence part using a pre-trained language model and expert knowledge. We implement our approach as a tool named DCS, conduct experiments on three popular machine translators, i.e., Google Translate, Baidu Translate and Microsoft Bing Translate, and compare DCS with two state-of-the-art approaches, CIT and CAT. The experimental results show that DCS achieves 8.6% and 35.4% higher precision than CIT and CAT, respectively. Moreover, the errors reported by DCS have the lowest redundancy, measured by duplicated error locations in the source sentence. DCS can also be used to complement existing approaches to achieve higher detection precision, and its efficiency is comparable to that of state-of-the-art approaches.
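
The general idea of a constituent-level differential oracle can be illustrated with a minimal Python sketch. This is not the DCS implementation: the nested-tuple tree, the dictionary-backed stub translators, the token-overlap (Jaccard) similarity standing in for a pre-trained language model, and the 0.5 threshold are all assumptions made for illustration, and translating each constituent separately is only one possible way to compare sentence parts.

```python
from typing import Callable, List, Tuple, Union

# Hypothetical toy constituency tree: (label, child, child, ...),
# where each child is either a subtree or a token string.
Tree = Union[str, tuple]


def leaves(tree: Tree) -> List[str]:
    """Return the tokens covered by a (sub)tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    tokens: List[str] = []
    for child in tree[1:]:
        tokens.extend(leaves(child))
    return tokens


def constituents(tree: Tree, min_len: int = 2) -> List[str]:
    """List the text of every constituent with at least min_len tokens."""
    if isinstance(tree, str):
        return []
    spans: List[str] = []
    words = leaves(tree)
    if len(words) >= min_len:
        spans.append(" ".join(words))
    for child in tree[1:]:
        spans.extend(constituents(child, min_len))
    return spans


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a crude stand-in for a semantic encoder."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def differential_test(
    tree: Tree,
    under_test: Callable[[str], str],
    reference: Callable[[str], str],
    similarity: Callable[[str, str], float] = jaccard,
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[str, float]]:
    """Flag constituents whose two translations disagree too much."""
    suspicious: List[Tuple[str, float]] = []
    # dict.fromkeys removes duplicate spans while keeping their order.
    for phrase in dict.fromkeys(constituents(tree)):
        score = similarity(under_test(phrase), reference(phrase))
        if score < threshold:
            suspicious.append((phrase, score))
    return suspicious


# Stub translators (hypothetical English-to-French outputs).
REFERENCE = {
    "the bank closed the account": "la banque a fermé le compte",
    "the bank": "la banque",
    "closed the account": "a fermé le compte",
    "the account": "le compte",
}
UNDER_TEST = dict(REFERENCE)
UNDER_TEST["the bank closed the account"] = "la rive a fermé le compte"
UNDER_TEST["the bank"] = "la rive"

tree = ("S", ("NP", "the", "bank"), ("VP", "closed", ("NP", "the", "account")))
report = differential_test(
    tree, lambda s: UNDER_TEST[s], lambda s: REFERENCE[s]
)
print([phrase for phrase, _ in report])  # -> ['the bank']
```

In this toy run the whole-sentence translations still overlap enough to pass the threshold, while the noun phrase "the bank" does not, which shows how comparing at the constituent level can localize a disagreement that a sentence-level comparison would miss.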
