Open Access
Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
Author(s) - Gábor Berend
Publication year - 2017
Publication title - Transactions of the Association for Computational Linguistics
Language(s) - English
Resource type - Journals
ISSN - 2307-387X
DOI - 10.1162/tacl_a_00059
Subject(s) - computer science , sequence labeling , natural language processing , artificial intelligence , generalization , neural coding , language model , speech recognition , linguistics
In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages. Our model relies only on a few thousand sparse coding-derived features, without applying any modification to the word representations employed for the different tasks. The proposed model has favorable generalization properties, as it retains over 89.8% of its average POS tagging accuracy when trained on only 1.2% of the total available training data, i.e., 150 sentences per language.
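The core idea of the abstract — turning dense word embeddings into sparse indicator features via sparse coding — can be illustrated with a minimal sketch. This is not the paper's exact pipeline; it assumes toy random embeddings and uses scikit-learn's `DictionaryLearning` as a stand-in sparse coder, with the indices of nonzero coefficients serving as the indicator features:

```python
# Minimal sketch (assumed setup, not the paper's implementation): derive
# sparse indicator features from dense word embeddings via dictionary learning.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.RandomState(0)
embeddings = rng.randn(50, 16)  # 50 toy "words", 16-dimensional dense vectors

# Learn an overcomplete dictionary and sparse codes; alpha controls sparsity.
dl = DictionaryLearning(
    n_components=32,
    transform_algorithm="lasso_lars",
    alpha=0.5,
    max_iter=20,
    random_state=0,
)
codes = dl.fit_transform(embeddings)  # shape (50, 32), mostly zeros

# Indicator features: for each word, the indices of its nonzero coefficients.
indicator_features = [np.flatnonzero(c) for c in codes]
```

Each word is thus represented by the small set of dictionary atoms active in its sparse code, which a downstream sequence labeler can consume as discrete features.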
