Greater Early Disambiguating Information for Less-Probable Words: The Lexicon Is Shaped by Incremental Processing
Author(s) - Adam King, Andrew Wedel
Publication year - 2020
Publication title - Open Mind
Language(s) - English
Resource type - Journals
ISSN - 2470-2986
DOI - 10.1162/opmi_a_00030
Subject(s) - lexicon, zipf's law, computer science, focus (optics), recall, linguistics, word (group theory), natural language processing, artificial intelligence, statistics, physics, philosophy, mathematics, optics
There has been much work over the last century on optimization of the lexicon for efficient communication, with a particular focus on the form of words as an evolving balance between production ease and communicative accuracy. Zipf's law of abbreviation, the cross-linguistic trend for less-probable words to be longer, represents some of the strongest evidence that the lexicon is shaped by a pressure for communicative efficiency. However, the various sounds that make up words do not all contribute the same amount of disambiguating information to a listener. Rather, the information a sound contributes depends in part on what specific lexical competitors exist in the lexicon. In addition, because the speech stream is perceived incrementally, early sounds in a word contribute on average more information than later sounds. Using a dataset of diverse languages, we demonstrate that, above and beyond containing more sounds, less-probable words contain sounds that convey more disambiguating information overall. We show further that this pattern tends to be strongest at word beginnings, where sounds can contribute the most information.
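To make the idea of per-sound disambiguating information concrete, the following is a minimal sketch, not necessarily the authors' exact method. It assumes a cohort-style measure: the information a segment contributes is the log ratio of the frequency mass of all words consistent with the input before that segment to the mass still consistent after it. The function name `segment_information`, the toy lexicon, and its frequency counts are all invented for illustration, and letters stand in for sounds.

```python
import math


def segment_information(word, lexicon):
    """Return the information (in bits) contributed by each segment of `word`.

    `lexicon` maps each word to a frequency count. `word` must be in the
    lexicon, so every prefix has non-zero frequency mass.
    Assumed measure (illustrative): for position i,
        info_i = log2(mass of words starting with word[:i]
                      / mass of words starting with word[:i + 1])
    """
    info = []
    for i in range(len(word)):
        # Frequency mass of the competitor cohort before hearing segment i.
        before = sum(f for w, f in lexicon.items() if w.startswith(word[:i]))
        # Frequency mass of the cohort that survives segment i.
        after = sum(f for w, f in lexicon.items() if w.startswith(word[:i + 1]))
        info.append(math.log2(before / after))
    return info


# Hypothetical toy lexicon with made-up frequency counts.
toy_lexicon = {"cat": 8, "cap": 4, "can": 2, "dog": 2}

for w in ("cat", "dog"):
    bits = segment_information(w, toy_lexicon)
    print(w, [round(b, 3) for b in bits], "total:", round(sum(bits), 3))
```

Under this measure the per-segment values sum to the word's total surprisal, the negative log of its probability in the lexicon. In the toy lexicon the rarer word "dog" is fully disambiguated by its first segment, while the more frequent "cat" shares its first two segments with competitors and is resolved only at the end.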