
Statistical analysis of small Twitter data collection to identify dengue outbreaks
Author(s) -
Carlos Euzebio,
Sidney Agy,
Claudio Boldorini,
Lucas Faria Porto,
José Renato Alcarás,
Alexandre Martinez,
Evandro Eduardo Seron Ruiz
Publication year - 2020
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5753/kdmile.2020.11954
Subject(s) - dengue fever, social media, microblogging, computer science, data collection, categorization, data set, set (abstract data type), data science, data mining, artificial intelligence, statistics, world wide web, medicine, mathematics, virology, programming language
This study presents an algorithmic strategy for analyzing a small set of social network data to monitor dengue. Previous studies have achieved similar results using large datasets of Twitter microblogs. In this study, we successfully map dengue cases using a small collection of tweets from a medium-sized city. A set of modules was constructed to collect, categorize, and display dengue-related tweets. We compared the collected tweets with real data on confirmed dengue cases and found a significant correlation between the number of confirmed dengue cases and the number of dengue-related tweets, even with such a small dataset. The results of this approach may be relevant to public health policy.
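
The comparison between tweet counts and confirmed cases can be illustrated with a minimal sketch. The abstract does not say which correlation measure was used, so Pearson's coefficient is assumed here. The function name `pearson_r`, the weekly aggregation, and all counts below are hypothetical placeholders, not data or code from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Synthetic placeholder counts (assumption: one value per week).
weekly_tweets = [3, 5, 9, 14, 11, 6]     # dengue-related tweets
weekly_cases = [10, 18, 30, 45, 38, 20]  # confirmed dengue cases

r = pearson_r(weekly_tweets, weekly_cases)
print(f"Pearson r = {r:.3f}")
```

With a small sample like this, the coefficient alone says little; a significance test on `r`, or a rank-based measure such as Spearman's, would be a reasonable next step, depending on how the counts are distributed.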