Open Access

Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut

Author(s) - Nathan Schneider, Emily Danchik, Chris Dyer, Noah A. Smith

Publication year - 2014

Publication title - Transactions of the Association for Computational Linguistics

Language(s) - English

Resource type - Journals

ISSN - 2307-387X

DOI - 10.1162/tacl_a_00176

Subject(s) - computer science, discriminative model, artificial intelligence, segmentation, natural language processing, CRFs, sentence, sequence labeling, identification (biology), representation (politics), feature (linguistics), conditional random field, task (project management), chunking (psychology), pattern recognition (psychology), linguistics, philosophy, botany, management, politics, political science, law, economics, biology

We present a novel representation, evaluation measure, and supervised models for the task of identifying the multiword expressions (MWEs) in a sentence, resulting in a lexical semantic segmentation. Our approach generalizes a standard chunking representation to encode MWEs containing gaps, thereby enabling efficient sequence tagging algorithms for feature-rich discriminative models. Experiments on a new dataset of English web text offer the first linguistically-driven evaluation of MWE identification with truly heterogeneous expression types. Our statistical sequence model greatly outperforms a lookup-based segmentation procedure, achieving nearly 60% F1 for MWE identification.
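To make the idea of a chunking representation extended with gaps more concrete, here is a minimal Python sketch of how such a tag sequence might be decoded into expressions. The abstract does not spell out the tag inventory, so the six tags used here are an assumption for illustration: uppercase O/B/I behave as in standard chunking, and lowercase o/b/i mark tokens that fall inside the gap of an open expression. The function name `decode_gappy_bio` and the example sentence are hypothetical and are not taken from the paper or its dataset.

```python
from typing import List


def decode_gappy_bio(tags: List[str]) -> List[List[int]]:
    """Group token indices into multiword expressions.

    Assumed (illustrative) tag set:
      O / B / I  outside, begin, inside of a top-level expression
      o / b / i  the same roles, for tokens inside the gap of an open
                 top-level expression (one level of nesting only)
    """
    groups: List[List[int]] = []
    outer = None  # top-level expression currently open, if any
    inner = None  # expression currently open inside a gap, if any

    for idx, tag in enumerate(tags):
        if tag == "B":
            outer = [idx]
            groups.append(outer)
            inner = None
        elif tag == "I":
            if outer is None:
                raise ValueError(f"'I' at position {idx} has no open expression")
            outer.append(idx)  # resumes the expression after any gap
            inner = None
        elif tag == "O":
            outer = None
            inner = None
        elif tag in ("o", "b", "i"):
            if outer is None:
                raise ValueError(f"gap tag '{tag}' at position {idx} is outside any expression")
            if tag == "b":
                inner = [idx]
                groups.append(inner)
            elif tag == "i":
                if inner is None:
                    raise ValueError(f"'i' at position {idx} has no open gap expression")
                inner.append(idx)
            else:  # "o": a plain token inside the gap
                inner = None
        else:
            raise ValueError(f"unknown tag '{tag}' at position {idx}")

    # A multiword expression needs at least two tokens; drop singletons.
    return [g for g in groups if len(g) > 1]


# Hypothetical example sentence with a gappy expression ("looked ... up")
# and a contiguous expression ("phone number") inside its gap.
tokens = ["She", "looked", "the", "phone", "number", "up"]
tags = ["O", "B", "o", "b", "i", "I"]
expressions = [[tokens[i] for i in g] for g in decode_gappy_bio(tags)]
```

Because every token still receives exactly one tag, a sequence encoded this way can be handled by ordinary first-order sequence tagging, which is the property the abstract highlights.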
