Open Access

Unsupervised Part-Of-Speech Tagging with Anchor Hidden Markov Models

Author(s) - Karl Stratos, Michael Collins, Daniel Hsu

Publication year - 2016

Publication title - Transactions of the Association for Computational Linguistics

Language(s) - English

Resource type - Journals

ISSN - 2307-387X

DOI - 10.1162/tacl_a_00096

Subject(s) - hidden Markov model, computer science, word (group theory), cluster analysis, artificial intelligence, speech recognition, estimator, part of speech tagging, exploit, natural language processing, pattern recognition (psychology), part of speech, mathematics, statistics, geometry, computer security

We tackle unsupervised part-of-speech (POS) tagging by learning hidden Markov models (HMMs) that are particularly well-suited for the problem. These HMMs, which we call anchor HMMs, assume that each tag is associated with at least one word that can have no other tag, which is a relatively benign condition for POS tagging (e.g., “the” is a word that appears only under the determiner tag). We exploit this assumption and extend the non-negative matrix factorization framework of Arora et al. (2013) to design a consistent estimator for anchor HMMs. In experiments, our algorithm is competitive with strong baselines such as the clustering method of Brown et al. (1992) and the log-linear model of Berg-Kirkpatrick et al. (2010). Furthermore, it produces an interpretable model in which hidden states are automatically lexicalized by words.
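
The anchor condition described in the abstract can be illustrated with a small check on a known emission matrix. This is a minimal sketch, not the paper's estimator: the toy vocabulary, the tag set, the probabilities, and the function names `find_anchor_words` and `satisfies_anchor_condition` are all assumptions made up for illustration. The matrix is assumed to store p(word | tag), one column per tag.

```python
import numpy as np


def find_anchor_words(emission, tol=1e-12):
    """Map each tag index to the word indices emitted only under that tag.

    Assumes emission[x, h] holds p(word x | tag h). A word is treated as an
    anchor for tag h if its probability is nonzero under h and (numerically)
    zero under every other tag.
    """
    support = emission > tol                 # (word, tag) pairs with nonzero probability
    single_tag = support.sum(axis=1) == 1    # words that occur under exactly one tag
    anchors = {h: [] for h in range(emission.shape[1])}
    for x in np.flatnonzero(single_tag):
        h = int(np.argmax(support[x]))       # the one tag that emits word x
        anchors[h].append(int(x))
    return anchors


def satisfies_anchor_condition(emission, tol=1e-12):
    """Return True if every tag has at least one anchor word."""
    anchors = find_anchor_words(emission, tol)
    return all(len(words) > 0 for words in anchors.values())


# Hypothetical toy model: rows are words, columns are tags.
vocab = ["the", "dog", "eats", "run"]
tags = ["DET", "NOUN", "VERB"]
emission = np.array([
    [1.0, 0.0, 0.0],   # "the": only DET, so it anchors DET
    [0.0, 0.8, 0.0],   # "dog": only NOUN
    [0.0, 0.0, 0.5],   # "eats": only VERB
    [0.0, 0.2, 0.5],   # "run": NOUN or VERB, so not an anchor
])

for h, words in find_anchor_words(emission).items():
    print(tags[h], [vocab[x] for x in words])
print(satisfies_anchor_condition(emission))
```

In this toy matrix each tag has exactly one anchor word, while "run" is ambiguous between two tags and anchors nothing. Removing the "eats" row would leave the verb tag without an anchor, and the condition would fail.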
