A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment
Author(s) - Jing Wang, Mohit Bansal, Kevin Gimpel, Brian D. Ziebart, Clement Yu
Publication year - 2015
Publication title - Transactions of the Association for Computational Linguistics
Language(s) - English
Resource type - Journals
ISSN - 2307-387X
DOI - 10.1162/tacl_a_00122
Subject(s) - computer science , semeval , word (group theory) , context (archaeology) , natural language processing , artificial intelligence , word sense disambiguation , task (project management) , unsupervised learning , linguistics , paleontology , philosophy , management , wordnet , economics , biology
Word sense induction (WSI) seeks to automatically discover the senses of a word in a corpus via unsupervised methods. We propose a sense-topic model for WSI, which treats sense and topic as two separate latent variables to be inferred jointly. Topics are informed by the entire document, while senses are informed by the local context surrounding the ambiguous word. We also discuss unsupervised ways of enriching the original corpus to improve model performance, including using neural word embeddings and external corpora to expand the context of each data instance. We demonstrate significant improvements over the previous state of the art, achieving the best results reported to date on the SemEval-2013 WSI task.
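
The distinction between the two kinds of evidence described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_contexts`, the `window` parameter, and the toy sentence are assumptions made for illustration. It only shows how one instance might be split into a document-wide bag of words (the kind of evidence that informs topics) and a small window around the ambiguous word (the kind that informs senses).

```python
from collections import Counter

def split_contexts(tokens, target_index, window=3):
    """Split one instance into global and local context (illustrative only).

    `window` is a hypothetical setting, not a value taken from the paper.
    """
    # Global context: bag of words over the whole document,
    # excluding the target occurrence itself.
    global_context = Counter(tokens[:target_index] + tokens[target_index + 1:])
    # Local context: up to `window` tokens on each side of the target word.
    start = max(0, target_index - window)
    left = tokens[start:target_index]
    right = tokens[target_index + 1:target_index + 1 + window]
    return global_context, left + right

# Toy example with the ambiguous word "bank".
doc = "she sat on the bank of the river and watched the water flow".split()
global_ctx, local_ctx = split_contexts(doc, doc.index("bank"), window=2)
print(local_ctx)
print(global_ctx.most_common(3))
```

In a full model the two contexts would feed separate latent variables that are inferred jointly; this sketch stops at the data preparation step.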