Population estimation by random forest analysis using Social Sensors
Author(s) - Hara Hiroki, Yoshikatsu Fujita, Kazuhiko Tsuda
Publication year - 2020
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2020.09.229
Subject(s) - computer science , population , attendance , identification (biology) , estimation , social media , event (particle physics) , random forest , space (punctuation) , league , set (abstract data type) , data science , data mining , world wide web , artificial intelligence , engineering , sociology , economics , demography , botany , physics , systems engineering , quantum mechanics , astronomy , biology , programming language , economic growth , operating system
This paper aims to estimate the population in a specific space from the number of posted tweets and the number of their senders, using the real-time nature and location information of Twitter data. The population to be estimated was the attendance at each game of the six teams of the Japan Professional Baseball Pacific League, held at each team's main stadium. The relation between attendance and Twitter data was analyzed, and random forest regression models built on Twitter data were used to estimate attendance. While many studies address event detection or location identification using Twitter data, no study has been reported that estimates the population in a specific space using the “time information” and “location information” characteristic of Twitter data. Because Twitter data contains users’ messages, using it for population estimation can be extended to other types of analysis, such as analyzing the feelings and opinions of the groups in the space.
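As a rough illustration of the approach described in the abstract, the sketch below regresses attendance on two per-game features, the tweet count and the number of distinct senders, with a small random forest written in plain Python. It is not the authors' implementation. The feature choice follows the abstract, but the function names (`build_tree`, `fit_forest`, `predict_forest`), the hyperparameters, and all numbers are assumptions made for this example. The data is a synthetic toy set, not data from the paper. The forest is simplified: each tree is grown on a bootstrap sample and each split considers one randomly chosen feature.

```python
import random
from statistics import mean


def build_tree(rows, targets, depth, rng, max_depth=3, min_size=2):
    """Grow one regression tree; returns a nested dict, or a number for a leaf."""
    if depth >= max_depth or len(rows) < 2 * min_size or len(set(targets)) == 1:
        return mean(targets)
    feature = rng.randrange(len(rows[0]))  # one random feature per split
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [i for i, r in enumerate(rows) if r[feature] <= threshold]
        right = [i for i, r in enumerate(rows) if r[feature] > threshold]
        if len(left) < min_size or len(right) < min_size:
            continue
        # Sum of squared errors around each side's mean.
        sse = 0.0
        for side in (left, right):
            m = mean(targets[i] for i in side)
            sse += sum((targets[i] - m) ** 2 for i in side)
        if best is None or sse < best[0]:
            best = (sse, threshold, left, right)
    if best is None:  # no split satisfies min_size
        return mean(targets)
    _, threshold, left, right = best

    def grow(idx):
        return build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                          depth + 1, rng, max_depth, min_size)

    return {"feature": feature, "threshold": threshold,
            "left": grow(left), "right": grow(right)}


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(node, dict):
        side = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[side]
    return node


def fit_forest(rows, targets, n_trees=50, seed=0):
    """Fit n_trees trees, each on a bootstrap sample of the games."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in sample],
                                 [targets[i] for i in sample], 0, rng))
    return forest


def predict_forest(forest, row):
    """Average the per-tree predictions."""
    return mean(predict_tree(tree, row) for tree in forest)


# Synthetic toy data (NOT from the paper): one row per game,
# (tweets posted near the stadium, distinct senders) -> attendance.
rows = [(t, t * 6 // 10) for t in range(100, 1300, 100)]
attendance = [10000 + 20 * senders for _, senders in rows]

forest = fit_forest(rows, attendance)
quiet_game = predict_forest(forest, (100, 60))
busy_game = predict_forest(forest, (1200, 720))
print(f"quiet game: {quiet_game:.0f}, busy game: {busy_game:.0f}")
```

Because every leaf stores a mean of training targets, the estimates always stay within the range of attendances seen during fitting. In practice a library implementation such as scikit-learn's `RandomForestRegressor` would be the more usual choice. The hand-written version here only keeps the example free of dependencies.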