Unsupervised learning of natural languages
Author(s) - Zach Solan, D. Horn, Eytan Ruppin, Shimon Edelman
Publication year - 2005
Publication title - Proceedings of the National Academy of Sciences
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 5.011
H-Index - 771
eISSN - 1091-6490
pISSN - 0027-8424
DOI - 10.1073/pnas.0409746102
Subject(s) - computer science, grammar induction, natural language processing, artificial intelligence, syntax, rule based machine translation, natural language, generalization, stochastic context free grammar, context (archaeology), context free grammar, unsupervised learning, information extraction, tree adjoining grammar, biology, mathematics, mathematical analysis, paleontology
We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The ADIOS (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics.
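
The abstract names two ingredients: statistical pattern extraction and structured generalization. The toy Python sketch below illustrates the general flavor of each on a three-sentence corpus. It is not the algorithm from the paper: the function names `frequent_ngrams` and `slot_fillers`, the raw-count threshold, and the example corpus are all assumptions made for illustration, and the paper's actual significance criteria and recursive distillation are not reproduced here.

```python
from collections import Counter


def frequent_ngrams(corpus, n=3, min_count=2):
    """Toy pattern extraction (hypothetical helper, not from the paper):
    count contiguous token n-grams and keep those that recur."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


def slot_fillers(corpus, left, right):
    """Toy generalization (hypothetical helper, not from the paper):
    collect the tokens that occur between a fixed left and right neighbor,
    i.e. candidates for one interchangeable class."""
    fillers = set()
    for sentence in corpus:
        tokens = sentence.split()
        for i in range(1, len(tokens) - 1):
            if tokens[i - 1] == left and tokens[i + 1] == right:
                fillers.add(tokens[i])
    return fillers


# Made-up corpus, chosen only to keep the example small.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat sat on a mat",
]

patterns = frequent_ngrams(corpus, n=3, min_count=2)
animals = slot_fillers(corpus, left="the", right="sat")
print(patterns)
print(sorted(animals))
```

In this sketch a recurring trigram stands in for an extracted pattern, and the set of tokens sharing the context "the _ sat" stands in for a class of interchangeable elements; a hierarchical learner would go on to treat such patterns and classes as new units and repeat the process.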