Open Access

Differential Testing of Machine Translators based on Compositional Semantics

Author(s) - Shuang Liu, Shujie Dou, Junjie Chen, Zhirun Zhang, Ye Lu

Publication year - 2023

Publication title - IEEE Transactions on Software Engineering

Language(s) - English

Resource type - Journals

eISSN - 1939-3520

pISSN - 0098-5589

DOI - 10.1109/tse.2023.3323969

Subject(s) - computing and processing

Powered by advances in deep neural networks, machine translation software has made rapid progress recently. Machine translators are widely adopted in people’s daily lives, e.g., for information consumption, medical consumption and online shopping. However, machine translators are far from robust and may produce wrong translations, which could cause misunderstandings or even serious consequences. It is thus critical to detect errors in machine translators and to provide informative feedback for developers. In this work, we adopt differential testing to test machine translators, using mature commercial translators as reference machine translation engines. The oracle is based on the principle of compositionality, which specifies that the meaning of a complex expression is determined by the meanings of its constituent expressions and the syntactic rules used to combine them. Accordingly, we design an oracle that conducts similarity comparison guided by syntactic structure and semantic encoding. In particular, we employ constituency parsing to obtain the part-whole structural relation between a sentence and each of its components. We then compute the semantic similarity of each sentence part with a pre-trained language model and expert knowledge. We implement our approach as a tool named DCS, conduct experiments on three popular machine translators, i.e., Google Translate, Baidu Translate and Microsoft Bing Translate, and compare DCS with two state-of-the-art approaches, i.e., CIT and CAT. The experimental results show that DCS achieves 8.6% and 35.4% higher precision than CIT and CAT, respectively. Moreover, the errors reported by DCS have the lowest redundancy in terms of duplicated error locations in the source sentence. DCS can be used to complement existing approaches and achieve higher detection precision. It also shows efficiency comparable to that of state-of-the-art approaches.
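The part-by-part comparison described in the abstract can be illustrated with a minimal sketch. It is not the DCS implementation: the constituency trees are written by hand instead of being produced by a parser, a token-overlap score from Python's `difflib` stands in for the pre-trained language model and expert knowledge, and the names `constituents`, `similarity`, `suspicious_parts` and the threshold of 0.6 are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

# Toy constituency tree: (label, child, child, ...); a leaf is a plain word.
# In practice a constituency parser would produce these (assumption: hand-written here).

def leaves(tree):
    """Return the words covered by a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words

def constituents(tree):
    """Yield the word list of every phrase node, i.e. each 'part' of the sentence."""
    if isinstance(tree, str):
        return
    yield leaves(tree)
    for child in tree[1:]:
        yield from constituents(child)

def similarity(a, b):
    """Stand-in score in [0, 1]: token overlap, not a semantic language model."""
    return SequenceMatcher(None, a, b).ratio()

def suspicious_parts(tree_under_test, tree_reference, threshold=0.6):
    """Return reference parts that have no sufficiently similar part in the tested output."""
    tested = list(constituents(tree_under_test))
    flagged = []
    for ref_part in constituents(tree_reference):
        best = max(similarity(ref_part, part) for part in tested)
        if best < threshold:
            flagged.append((" ".join(ref_part), round(best, 2)))
    return flagged

# Output of a reference translator and of the translator under test (made-up example).
reference = ("S", ("NP", "the", "doctor"),
             ("VP", "prescribed", ("NP", "a", "new", "medicine")))
under_test = ("S", ("NP", "the", "doctor"),
              ("VP", "prescribed", ("NP", "a", "old", "recipe")))

for phrase, score in suspicious_parts(under_test, reference):
    print(f"possible mistranslation near: {phrase!r} (best similarity {score})")
```

Comparing at the level of constituents rather than whole sentences is what lets such an oracle point to where the two translations diverge. In this sketch the whole sentences are still similar enough to pass the threshold, while the verb phrase and its object noun phrase are flagged.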
