
Strategies for the Identification and Prevention of Survey Fraud: Data Analysis of a Web-Based Survey
Author(s) - Mandi Pratt-Chapman, Jenna Moses, Hannah Arem
Publication year - 2021
Publication title - JMIR Cancer
Language(s) - English
Resource type - Journals
ISSN - 2369-1999
DOI - 10.2196/30730
Subject(s) - captcha, identification (biology), data quality, survey data collection, social media, incentive, computer science, web application, internet privacy, psychology, computer security, world wide web, statistics, business, mathematics, metric (unit), botany, marketing, economics, biology, microeconomics
Background: To assess the impact of COVID-19 on cancer survivors, we fielded a survey promoted via email and social media in winter 2020. Examination of the data showed suspicious patterns that warranted serious review.
Objective: The aim of this paper is to review the methods used to identify and prevent fraudulent survey responses.
Methods: As precautions, we included a Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA), a hidden question, and instructions for respondents to type a specific word. To identify likely fraudulent data, we defined a priori two kinds of indicators: those that warranted elimination and those that warranted suspicion. If a survey contained two or more suspicious indicators, the survey was eliminated. We examined differences between the retained and eliminated data sets.
Results: Of the total responses (N=1977), nearly three-fourths (n=1408, 71.2%) were dropped and just over one-fourth (n=569, 28.8%) were retained after data quality checking. Comparisons of the two data sets showed statistically significant differences across almost all demographic characteristics.
Conclusions: Numerous precautions beyond the inclusion of a CAPTCHA are needed when fielding web-based surveys, particularly if a financial incentive is offered.
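The elimination rule described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not list the actual indicators, so every indicator name below is hypothetical, and the sketch assumes that a single elimination indicator is enough to drop a response. Only the threshold of two or more suspicious indicators comes from the text.

```python
from dataclasses import dataclass, field

# Hypothetical indicator names (assumptions for illustration only).
# The abstract mentions a hidden question and a type-a-word instruction
# as precautions, but does not say how they were scored.
ELIMINATION_INDICATORS = {"hidden_question_answered", "instructed_word_missing"}
SUSPICION_INDICATORS = {"duplicate_ip", "implausibly_fast", "inconsistent_answers"}

# From the Methods: two or more suspicious indicators lead to elimination.
SUSPICION_THRESHOLD = 2


@dataclass
class Response:
    response_id: str
    flags: set = field(default_factory=set)  # indicator names raised for this response


def should_eliminate(response: Response) -> bool:
    """Return True if the response should be dropped from the analysis."""
    # Assumption: any single elimination indicator drops the response outright.
    if response.flags & ELIMINATION_INDICATORS:
        return True
    # Otherwise, count suspicious indicators against the threshold.
    suspicious_count = len(response.flags & SUSPICION_INDICATORS)
    return suspicious_count >= SUSPICION_THRESHOLD


def split_responses(responses):
    """Partition responses into (retained, eliminated) lists."""
    retained, eliminated = [], []
    for r in responses:
        (eliminated if should_eliminate(r) else retained).append(r)
    return retained, eliminated


if __name__ == "__main__":
    sample = [
        Response("r1"),
        Response("r2", {"duplicate_ip"}),
        Response("r3", {"duplicate_ip", "implausibly_fast"}),
        Response("r4", {"hidden_question_answered"}),
    ]
    retained, eliminated = split_responses(sample)
    print("retained:", [r.response_id for r in retained])
    print("eliminated:", [r.response_id for r in eliminated])
```

Keeping the two indicator sets and the threshold as named constants makes the a priori rule explicit and easy to audit, which matters when the retained and eliminated data sets are later compared.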