Open Access
What You Can Scrape and What Is Right to Scrape: A Proposal for a Tool to Collect Public Facebook Data
Author(s) - Moreno Mancosu, Federico Vegetti
Publication year - 2020
Publication title - Social Media + Society
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.941
H-Index - 32
ISSN - 2056-3051
DOI - 10.1177/2056305120940703
Subject(s) - internet privacy, the internet, social media, computer science, identifier, world wide web, personally identifiable information, information sensitivity, plan (archaeology), computer security, programming language, archaeology, history
In reaction to the Cambridge Analytica scandal, Facebook has restricted access to its Application Programming Interface (API). This new policy has limited the ability of independent researchers to study relevant topics in political and social behavior. Yet much of the public information that researchers may be interested in is still available on Facebook and can still be collected systematically through web scraping techniques. The goal of this article is twofold. First, we discuss some ethical and legal issues that researchers should consider as they plan the collection and possible publication of Facebook data. In particular, we discuss what kind of information about users can be ethically gathered (public information), what published data should look like to comply with privacy regulations (such as the GDPR), and what consequences violating Facebook’s terms of service may entail for the researcher. Second, we present a scraping routine for public Facebook posts and discuss some technical adjustments that can make the data ethically and legally acceptable. The code employs screen scraping to collect the list of reactions to a public Facebook post and applies a one-way cryptographic hash function to the users’ identifiers to pseudonymize their personal information while still keeping them traceable within the data. This article contributes to the debate around freedom of internet research and the ethical concerns that might arise from scraping data from the social web.
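
The pseudonymization step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the record layout (`user_id`, `reaction`), and the choice of a keyed hash (HMAC-SHA256, rather than a plain unkeyed hash) are all assumptions made for illustration. The idea it shows is the one the abstract names: each identifier is replaced by a one-way digest, so the same user always maps to the same pseudonym within the data, while the original identifier cannot be read back from the published value.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Return a one-way pseudonym for a user identifier.

    Assumption: HMAC-SHA256 with a secret key held by the researcher is used
    here so that pseudonyms cannot be rebuilt by hashing guessed identifiers.
    """
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymize_reactions(reactions, secret_key):
    """Replace the identifier in each reaction record with its pseudonym.

    `reactions` is a hypothetical list of dicts with the keys
    "user_id" and "reaction"; the real scraper's output may differ.
    """
    return [
        {
            "user": pseudonymize(record["user_id"], secret_key),
            "reaction": record["reaction"],
        }
        for record in reactions
    ]


# Usage with made-up identifiers (not real Facebook data).
key = b"keep-this-key-private"
scraped = [
    {"user_id": "example.user.1", "reaction": "LIKE"},
    {"user_id": "example.user.2", "reaction": "ANGRY"},
    {"user_id": "example.user.1", "reaction": "LOVE"},
]
pseudonymized = pseudonymize_reactions(scraped, key)
```

In this sketch the first and third records share a pseudonym because they share an identifier, which is what keeps users traceable across posts. Discarding the key after processing would make it harder still to link pseudonyms back to accounts.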
