Open Access

Crosslingual and Multilingual Construction of Syntax-Based Vector Space Models

Author(s) - Jason Utt, Sebastian Padó

Publication year - 2014

Publication title - Transactions of the Association for Computational Linguistics

Language(s) - English

Resource type - Journals

ISSN - 2307-387X

DOI - 10.1162/tacl_a_00180

Subject(s) - computer science, natural language processing, syntax, artificial intelligence, parsing, lexicon, lexical analysis, semantics (computer science), distributional semantics, semantic similarity, programming language

Syntax-based distributional models of lexical semantics provide a flexible and linguistically adequate representation of co-occurrence information. However, their construction requires large, accurately parsed corpora, which are unavailable for most languages. In this paper, we develop a number of methods to overcome this obstacle. We describe (a) a crosslingual approach that constructs a syntax-based model for a new language requiring only an English resource and a translation lexicon; and (b) multilingual approaches that combine crosslingual with monolingual information, subject to availability. We evaluate on two lexical semantic benchmarks in German and Croatian. We find that the models exhibit complementary profiles: crosslingual models yield higher accuracies while monolingual models provide better coverage. In addition, we show that simple multilingual models can successfully combine their strengths.
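To make the crosslingual idea in (a) concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a target-language word can be represented by summing the English syntax-based vectors of its translations. The toy vectors, the lexicon entries, and the function names (`crosslingual_vector`, `cosine`, `similarity`) are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

# Toy English syntax-based model: word -> {(dependency relation, context word): weight}.
# All entries are invented for illustration; they are not from the paper.
ENGLISH_MODEL = {
    "dog": {("subj_of", "bark"): 5.0, ("obj_of", "feed"): 3.0},
    "cat": {("subj_of", "meow"): 4.0, ("obj_of", "feed"): 3.0},
    "car": {("obj_of", "drive"): 6.0, ("obj_of", "park"): 2.0},
}

# Toy German-to-English translation lexicon (hypothetical entries).
LEXICON = {
    "Hund": ["dog"],
    "Katze": ["cat"],
    "Auto": ["car"],
    "Wagen": ["car"],
}


def crosslingual_vector(word, lexicon, english_model):
    """Sum the English vectors of every known translation of `word`.

    Assumption: summation is one simple way to combine translations;
    the source abstract does not specify the combination scheme.
    """
    vector = defaultdict(float)
    for translation in lexicon.get(word, []):
        for feature, weight in english_model.get(translation, {}).items():
            vector[feature] += weight
    return dict(vector)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(feature, 0.0) for feature, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # no usable vector, e.g. the word is missing from the lexicon
    return dot / (norm_u * norm_v)


def similarity(word_a, word_b, lexicon=LEXICON, english_model=ENGLISH_MODEL):
    """Similarity of two target-language words via their English vectors."""
    return cosine(
        crosslingual_vector(word_a, lexicon, english_model),
        crosslingual_vector(word_b, lexicon, english_model),
    )


if __name__ == "__main__":
    for pair in [("Hund", "Katze"), ("Auto", "Wagen"), ("Hund", "Fahrrad")]:
        print(pair, round(similarity(*pair), 3))
```

In this sketch a word with no lexicon entry, such as "Fahrrad", gets an empty vector and a similarity of zero. That is one plausible reading of why a purely crosslingual model can fall behind a monolingual one on coverage.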
