Introduction to the special section on linguistically apt statistical methods
Author(s) - Jason Eisner
Publication year - 2002
Publication title - Cognitive Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.498
H-Index - 114
eISSN - 1551-6709
pISSN - 0364-0213
DOI - 10.1207/s15516709cog2603_1
Subject(s) - section (typography), library science, citation, computer science, associate editor, operating system
In 1994, about 6 years after it was first infiltrated by statistical methods, the Association for Computational Linguistics hosted a workshop called “The Balancing Act: Combining Symbolic and Statistical Approaches to Language” (Klavans & Resnik, 1996). The workshop argued that linguistics and statistics were not fundamentally at odds, even though the recent well-known statistical techniques for part-of-speech disambiguation (Church, 1988; DeRose, 1988) had, like their predecessors in speech recognition, flouted Chomsky’s (1957) warnings that Markov or n-gram models were inadequate to model language. The success of these Markovian techniques had merely established that empirically estimated probabilities could be rather effective even with an impoverished theory of linguistic structure. As an engineering matter, the workshop argued, it was wise to incorporate probabilities or other numbers into any linguistic approach.

Several years later, it seems worth taking another snapshot from this perspective. It is fair to say that a greater proportion of hybrid approaches to language now are cleanly structured rather than cobbled together, and that the benefits to both sides of such approaches are better understood. The prevalent methodology is to design the form of one’s statistical model so that it is capable of expressing the kinds of linguistic generalizations that one cares about, and then to set the free parameters of this model so that its predicted behavior roughly matches the observed behavior of some training data.

The reason that one augments a symbolic generative grammar with probabilities is to make it more robust to noise and ambiguity. After all, statistics is the art of plausibly reconstructing the unknown, which is exactly what language comprehension and learning require. Conversely, one constrains a probability model with grammar to make it more robust to poverty of the stimulus. After all, from sparse data a statistician cannot hope to estimate a separate probability for every string of the language. All that is practical is to estimate a moderate set of parameters that encode high-level properties from which the behavior of the entire language emerges.

Carrying out this program is not trivial in practice. Patterning a statistical model after a linguistic theory may require some rethinking of the theory, especially if the model is to be elegant and computationally tractable. And there is more than one way to do it: the first few tries at adding linguistic sophistication often hurt a system’s accuracy rather than helping it.
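As a rough illustration of the methodology described above (fix the form of a model, then fit its free parameters to training data), the following minimal sketch attaches probabilities to the rules of a context-free grammar. It is not taken from the article: the three-tree treebank, the tuple encoding of trees, and the function names `rules`, `estimate`, and `tree_prob` are all hypothetical, and relative-frequency estimation is simply one standard way to set such parameters.

```python
from collections import Counter

# Hypothetical toy treebank (an assumption for illustration only).
# A tree is a tuple (label, child, child, ...); leaves are plain strings.
TREEBANK = [
    ("S", ("NP", "she"), ("VP", ("V", "saw"), ("NP", "stars"))),
    ("S", ("NP", "she"), ("VP", ("V", "slept"))),
    ("S", ("NP", "stars"), ("VP", ("V", "slept"))),
]


def rules(tree):
    """Yield (lhs, rhs) for every grammar rule used in a tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def estimate(treebank):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    counts = Counter(r for tree in treebank for r in rules(tree))
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}


def tree_prob(tree, probs):
    """Probability of a tree: the product of its rule probabilities."""
    p = 1.0
    for rule in rules(tree):
        p *= probs.get(rule, 0.0)  # a rule never seen in training gets 0
    return p


PROBS = estimate(TREEBANK)

# A sentence that never occurs in the toy treebank.
UNSEEN = ("S", ("NP", "stars"), ("VP", ("V", "saw"), ("NP", "she")))
print(tree_prob(UNSEEN, PROBS))
```

In this sketch only seven rule probabilities are estimated, yet the unseen sentence still receives a nonzero probability because every rule it uses was observed somewhere in the training trees. That is the sense in which a moderate set of parameters can determine the behavior of many strings that the data never showed directly.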
