How Data Drive Early Word Learning: A Cross-Linguistic Waiting Time Analysis
Author(s) -
Francis Mollica,
Steven T. Piantadosi
Publication year - 2017
Publication title -
Open Mind
Language(s) - English
Resource type - Journals
ISSN - 2470-2986
DOI - 10.1162/opmi_a_00006
Subject(s) - word (group theory) , language acquisition , word learning , computer science , artificial intelligence , statistical model , statistical learning , natural language processing , machine learning , linguistics , vocabulary , philosophy
The extent to which word learning is delayed by maturation as opposed to accumulating data is a longstanding question in language acquisition. Further, the precise way in which data influence learning on a large scale is unknown: experimental results reveal that children can rapidly learn words from single instances as well as by aggregating ambiguous information across multiple situations. We analyze Wordbank, a large cross-linguistic dataset of word acquisition norms, using a statistical waiting time model to quantify the role of data in early language learning, building on Hidaka (2013). We find that the model both fits and accurately predicts the shape of children’s growth curves. Further analyses of model parameters suggest a primarily data-driven account of early word learning. The parameters of the model directly characterize both the amount of data required and the rate at which informative data occur. With high statistical certainty, words require on the order of 10 learning instances, which occur on average once every two months. Our method is extremely simple, statistically principled, and broadly applicable to modeling data-driven learning effects in development.
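To make the two parameters concrete, here is a minimal sketch of a waiting-time calculation. The abstract does not give the model's distributional form, so this sketch assumes that learning instances arrive as a Poisson process at a constant rate starting from birth, and that a word is acquired at the k-th instance (an Erlang waiting time). The function name `erlang_cdf` and the constants are hypothetical; the values 10 instances and 0.5 instances per month are only the rough magnitudes quoted above, not fitted estimates.

```python
import math


def erlang_cdf(t, k, rate):
    """Probability that the k-th event of a Poisson process with the
    given rate has occurred by time t (assumed model, see lead-in)."""
    if t <= 0:
        return 0.0
    x = rate * t
    # Probability of seeing fewer than k events by time t ...
    fewer_than_k = sum(math.exp(-x) * x**n / math.factorial(n) for n in range(k))
    # ... so the complement is the probability the word is already learned.
    return 1.0 - fewer_than_k


# Rough magnitudes from the abstract (illustrative, not fitted values).
K_INSTANCES = 10        # learning instances required per word
RATE_PER_MONTH = 0.5    # one informative instance every two months

# Expected waiting time for the k-th instance is k / rate.
mean_wait_months = K_INSTANCES / RATE_PER_MONTH  # 20.0

# Predicted proportion of children producing the word at each age in months.
curve = {age: erlang_cdf(age, K_INSTANCES, RATE_PER_MONTH) for age in range(8, 37, 4)}

if __name__ == "__main__":
    print(f"mean waiting time: {mean_wait_months} months")
    for age, p in curve.items():
        print(f"{age:2d} months: {p:.3f}")
```

Under these assumptions the expected age of acquisition is the number of instances divided by the rate, and the cumulative curve has the S-shape typical of acquisition norms. A model with an additional start-age parameter would shift this curve to the right.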