z-logo
open-access-imgOpen Access
The Language Demographics of Amazon Mechanical Turk
Author(s) -
Ellie Pavlick,
Matt Post,
Ann Irvine,
Dmitry Kachaev,
Chris Callison-Burch
Publication year - 2014
Publication title -
transactions of the association for computational linguistics
Language(s) - English
Resource type - Journals
ISSN - 2307-387X
DOI - 10.1162/tacl_a_00167
Subject(s) - crowdsourcing , computer science , demographics , natural language processing , crowd sourcing , scale (ratio) , machine translation , artificial intelligence , data science , world wide web , geography , demography , cartography , sociology
We present a large scale study of the languages spoken by bilingual workers on Mechanical Turk (MTurk). We establish a methodology for determining the language skills of anonymous crowd workers that is more robust than simple surveying. We validate workers’ self-reported language skill claims by measuring their ability to correctly translate words, and by geolocating workers to see if they reside in countries where the languages are likely to be spoken. Rather than posting a one-off survey, we posted paid tasks consisting of 1,000 assignments to translate a total of 10,000 words in each of 100 languages. Our study ran for several months, and was highly visible on the MTurk crowdsourcing platform, increasing the chances that bilingual workers would complete it. Our study was useful both to create bilingual dictionaries and to act as census of the bilingual speakers on MTurk. We use this data to recommend languages with the largest speaker populations as good candidates for other researchers who want to develop crowdsourced, multilingual technologies. To further demonstrate the value of creating data via crowdsourcing, we hire workers to create bilingual parallel corpora in six Indian languages, and use them to train statistical machine translation systems.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom