
Identifying valuable information from Twitter during natural disasters
Author(s) - Truong Brandon, Caragea Cornelia, Squicciarini Anna, Tapia Andrea H.
Publication year - 2014
Publication title - Proceedings of the American Society for Information Science and Technology
Language(s) - English
Resource type - Journals
eISSN - 1550-8390
pISSN - 0044-7870
DOI - 10.1002/meet.2014.14505101162
Subject(s) - social media, natural disaster, set (abstract data type), computer science, event (particle physics), feature (linguistics), context (archaeology), naive bayes classifier, natural (archaeology), order (exchange), data set, information retrieval, artificial intelligence, data mining, data science, world wide web, geography, support vector machine, business, linguistics, archaeology, meteorology, philosophy, physics, finance, quantum mechanics, programming language
Social media is a vital source of information during any major event, especially natural disasters. However, as the volume of social media data grows exponentially, so does the volume of conversational data that carries no valuable information, especially in the context of disaster events. This diminishes people’s ability to find the information they need to organize relief efforts, find help, and potentially save lives. This project focuses on the development of a Bayesian approach to classifying tweets (posts on Twitter) sent during Hurricane Sandy, in order to distinguish “informational” from “conversational” tweets. We designed an effective set of features and used them as input to Naïve Bayes classifiers. Compared with a “bag of words” approach, the new feature set yields similar classification results while containing only 9 features, against more than 3000 features for “bag of words.” When the designed feature set is combined with “bag of words,” accuracy reaches 85.2914%. If integrated into disaster‐related systems, our approach can help any person or organization seeking to extract useful information in the midst of a natural disaster.
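
The kind of classifier described in the abstract can be illustrated with a minimal sketch. It assumes a multinomial Naïve Bayes model with add-one smoothing, trained on "bag of words" tokens combined with a few hand-crafted features. The abstract does not list the nine designed features, so the three used here (contains a URL, contains a digit, starts with an @-mention) are hypothetical stand-ins, and the toy tweets are invented for illustration rather than taken from the Hurricane Sandy data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (bag of words)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def handcrafted(text):
    """Hypothetical surface features (NOT the paper's nine), as pseudo-tokens."""
    feats = []
    if re.search(r"https?://", text):
        feats.append("__HAS_URL__")
    if re.search(r"\d", text):
        feats.append("__HAS_NUMBER__")
    if text.lstrip().startswith("@"):
        feats.append("__IS_REPLY__")
    return feats


def featurize(text):
    """Combine bag-of-words tokens with the hand-crafted pseudo-tokens."""
    return tokenize(text) + handcrafted(text)


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_docs = Counter(labels)
        self.token_counts = {label: Counter() for label in self.class_docs}
        for text, label in zip(texts, labels):
            self.token_counts[label].update(featurize(text))
        self.vocab = set()
        for counts in self.token_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, text):
        # Tokens never seen in training are skipped.
        tokens = [t for t in featurize(text) if t in self.vocab]
        n_docs = sum(self.class_docs.values())
        scores = {}
        for label, counts in self.token_counts.items():
            total = sum(counts.values())
            score = math.log(self.class_docs[label] / n_docs)  # log prior
            for t in tokens:
                score += math.log((counts[t] + 1) / (total + len(self.vocab)))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy examples, for illustration only.
TOY_TWEETS = [
    ("Shelter open at Lincoln High School, 200 beds available http://example.com/shelter", "informational"),
    ("Power outage reported on Main St, crews arriving in 2 hours", "informational"),
    ("Flooding closes the bridge, use Route 9 instead http://example.com/roads", "informational"),
    ("@sam lol this storm is crazy, stay safe", "conversational"),
    ("@kim omg so bored, no wifi lol", "conversational"),
    ("I can't believe this weather, ugh", "conversational"),
]

if __name__ == "__main__":
    texts = [t for t, _ in TOY_TWEETS]
    labels = [y for _, y in TOY_TWEETS]
    model = NaiveBayes().fit(texts, labels)
    print(model.predict("Bridge closed, shelter beds available http://example.com"))
```

Encoding the hand-crafted features as pseudo-tokens lets a single model consume both representations, which is one simple way to combine a small feature set with "bag of words." A real system would need a labeled corpus and a held-out evaluation to report accuracy.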