Adversarial Attacks on Crowdsourcing Quality Control
Author(s) -
Alessandro Checco,
Jo Bates,
Gianluca Demartini
Publication year - 2020
Publication title -
journal of artificial intelligence research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.79
H-Index - 123
eISSN - 1943-5037
pISSN - 1076-9757
DOI - 10.1613/jair.1.11332
Subject(s) - crowdsourcing , computer science , quality (philosophy) , set (abstract data type) , process (computing) , benchmark (surveying) , control (management) , adversarial system , key (lock) , plug in , data science , task (project management) , artificial intelligence , computer security , machine learning , world wide web , engineering , philosophy , geodesy , epistemology , systems engineering , programming language , geography , operating system
Crowdsourcing is a popular methodology to collect manual labels at scale. Such labels are often used to train AI models and, thus, quality control is a key aspect in the process. One of the most popular quality assurance mechanisms in paid micro-task crowdsourcing is based on gold questions: the use of a small set of tasks for which the requester knows the correct answer and, thus, is able to directly assess crowdwork quality. In this paper, we show that such a mechanism is prone to an attack carried out by a group of colluding crowdworkers that is easy to implement and deploy: the inherent size limit of the gold set can be exploited by building an inferential system to detect which parts of the job are more likely to be gold questions. The described attack is robust to various forms of randomisation and programmatic generation of gold questions. We present the architecture of the proposed system, composed of a browser plug-in and an external server used to share information, and briefly introduce its potential evolution to a decentralised implementation. We implement and experimentally validate the gold question detection system, using real-world data from a popular crowdsourcing platform. Our experimental results show that crowdworkers using the proposed system spend more time on signalled gold questions but do not neglect the others thus achieving an increased overall work quality. Finally, we discuss the economic and sociological implications of this kind of attack.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom