Open Access
Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
Author(s) - Gábor Berend
Publication year - 2017
Publication title - Transactions of the Association for Computational Linguistics
Language(s) - English
Resource type - Journals
ISSN - 2307-387X
DOI - 10.1162/tacl_a_00059
Subject(s) - computer science , sequence labeling , natural language processing , artificial intelligence , generalization , neural coding , language model , speech recognition , linguistics
In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages. Our model relies only on a few thousand sparse coding-derived features, without applying any modification to the word representations employed for the different tasks. The proposed model has favorable generalization properties, as it retains over 89.8% of its average POS tagging accuracy when trained on only 1.2% of the total available training data, i.e., 150 sentences per language.
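The core idea of the abstract — turning dense word embeddings into sparse indicator features via sparse coding — can be illustrated with a minimal sketch. This is not the paper's exact pipeline; it assumes toy random embeddings and uses scikit-learn's `DictionaryLearning` as a stand-in sparse coder, with the indices of nonzero coefficients serving as the indicator features:

```python
# Minimal sketch (assumed setup, not the paper's implementation): derive
# sparse indicator features from dense word embeddings via dictionary learning.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.RandomState(0)
embeddings = rng.randn(50, 16)  # 50 toy "words", 16-dimensional dense vectors

# Learn an overcomplete dictionary and sparse codes; alpha controls sparsity.
dl = DictionaryLearning(
    n_components=32,
    transform_algorithm="lasso_lars",
    alpha=0.5,
    max_iter=20,
    random_state=0,
)
codes = dl.fit_transform(embeddings)  # shape (50, 32), mostly zeros

# Indicator features: for each word, the indices of its nonzero coefficients.
indicator_features = [np.flatnonzero(c) for c in codes]
```

Each word is thus represented by the small set of dictionary atoms active in its sparse code, which a downstream sequence labeler can consume as discrete features.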
