Authorship analysis of English and Spanish tweets
Author(s) - AlRashdan Mohammed N., Abdullah Malak, AlAyyoub Mahmoud, Jararweh Yaser
Publication year - 2020
Publication title - Proceedings of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.193
H-Index - 14
ISSN - 2373-9231
DOI - 10.1002/pra2.261
Subject(s) - computer science, misinformation, transfer of learning, artificial intelligence, profiling (computer programming), German, social media, natural language processing, set (abstract data type), task (project management), fake news, deep learning, simple (philosophy), world wide web, linguistics, internet privacy, computer security, engineering, systems engineering, programming language, philosophy, operating system, epistemology
Despite the countless advantages gained from their free, open, and ubiquitous nature, online social networks come with their own set of problems and challenges. For example, they are fertile ground for fake accounts and autonomous bots that spread fake news. Determining whether a piece of text was written by a bot or a human would be of great value in the fight against the spread of fake news and misinformation. In this paper, we address this problem using three families of Machine Learning (ML) techniques: conventional, Deep Learning (DL) based, and Transfer Learning (TL) based. Using the dataset of the well-known PAN 2019 Author Profiling Task, we show that relatively simple conventional ML methods can outperform DL and TL based ones for both languages considered (English and Spanish). In fact, our simplest model performs close to the state-of-the-art (SOTA) systems for English and even outperforms the SOTA systems for Spanish.
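The abstract does not say which conventional methods or features were used, so the following is only an illustrative sketch of what a "relatively simple conventional ML method" for bot-versus-human classification might look like, not the authors' system. It assumes character trigram features and a multinomial Naive Bayes classifier with add-one smoothing; the class `NaiveBayes`, the helper `char_ngrams`, and the four training tweets are hypothetical, invented for this example, and are not taken from the PAN 2019 dataset.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over character n-grams (add-one smoothing)."""

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.docs = Counter(labels)         # label -> number of training texts
        for text, label in zip(texts, labels):
            self.counts[label].update(char_ngrams(text))
        # Vocabulary: every n-gram seen in any class during training.
        self.vocab = {g for c in self.counts.values() for g in c}
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        scores = {}
        for label, grams in self.counts.items():
            denom = sum(grams.values()) + len(self.vocab)
            score = math.log(self.docs[label] / total_docs)  # class prior
            for g in char_ngrams(text):
                if g in self.vocab:  # ignore n-grams never seen in training
                    score += math.log((grams[g] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Toy, made-up training data (NOT from the PAN 2019 dataset).
train_texts = [
    "Breaking: click here to win a free prize now http://t.co/x",
    "Win a free prize now, click here http://t.co/y",
    "had such a great dinner with my sister tonight",
    "can't believe how cold it is this morning lol",
]
train_labels = ["bot", "bot", "human", "human"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("click here to win a free prize"))
print(model.predict("great dinner with my sister"))
```

Character n-grams are used here because they need no language-specific tokenizer, so the same sketch applies unchanged to English and Spanish text. A real evaluation would train on the task's labeled accounts and report accuracy on a held-out split.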