SPMM: A Soft Piecewise Mapping Model for Bilingual Lexicon Induction
Author(s) -
Yan Fan,
Chengyu Wang,
Boxing Chen,
Zhongkai Hu,
Xiaofeng He
Publication year - 2019
Publication title -
Society for Industrial and Applied Mathematics eBooks
Language(s) - English
Resource type - Book series
DOI - 10.1137/1.9781611975673.28
Subject(s) - computer science, embedding, artificial intelligence, word embedding, natural language processing, lexicon, boosting (machine learning)
Bilingual Lexicon Induction (BLI) aims to induce word translations between two distinct languages. The bilingual dictionaries generated via BLI are essential for cross-lingual NLP applications. Most existing methods assume that a single mapping matrix can be learned to project the embedding of a source-language word onto that of the target-language word sharing the same meaning. However, due to the complicated nature of linguistic regularities, a single matrix may not provide a sufficiently large parameter space, nor can it be tailored to the semantics of words across different domains and topics. In this paper, we propose the Soft Piecewise Mapping Model (SPMM). It generates word alignments between two languages by learning multiple mapping matrices under an orthogonality constraint. Each matrix encodes embedding-translation knowledge over a distribution of latent topics in the embedding spaces. This learning problem can be formulated as an extended version of Wahba's problem, for which a closed-form solution is derived. To address the limited size of training data for low-resource languages and emerging domains, an iterative boosting method based on SPMM is used to augment the training dictionaries. Experiments on both general and domain-specific corpora show that SPMM is effective and outperforms previous methods.
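As background for the closed-form solution mentioned above: in the single-matrix special case, learning an orthogonal mapping between embedding spaces reduces to the orthogonal Procrustes problem (a form of Wahba's problem), solvable in closed form via an SVD. The sketch below illustrates only this standard building block, not SPMM's extension to multiple topic-weighted matrices; all variable names (`X`, `Y`, `R`, `orthogonal_map`) are illustrative assumptions, not identifiers from the paper.

```python
import numpy as np

def orthogonal_map(X, Y):
    """Closed-form solution of the orthogonal Procrustes problem:
    find an orthogonal W minimizing ||X @ W - Y||_F.
    The optimum is W = U @ Vt, where U, S, Vt = svd(X^T Y)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Illustrative check: recover a known orthogonal mapping from
# noiselessly aligned embedding pairs.
rng = np.random.default_rng(0)
d = 50
R, _ = np.linalg.qr(rng.standard_normal((d, d)))  # ground-truth mapping
X = rng.standard_normal((1000, d))                # "source" embeddings
Y = X @ R                                         # perfectly aligned "targets"
W = orthogonal_map(X, Y)
print(np.allclose(W, R))  # → True
```

With noise-free pairs the mapping is recovered exactly; with a real seed dictionary the same formula gives the least-squares-optimal orthogonal map, which is why the orthogonality constraint admits a closed-form fit rather than requiring gradient-based training.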