Streaming Social Media Data Analysis for Events Extraction and Warehousing using Hadoop and Storm: Drug Abuse Case Study
Author(s) - Ferdaous Jenhani, Mohamed Salah Gouider, Lamjed Ben Saïd
Publication year - 2019
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2019.09.316
Subject(s) - computer science , data warehouse , big data , oracle , social media , unstructured data , business intelligence , data science , knowledge extraction , data integration , process (computing) , task (project management) , data extraction , database , data mining , world wide web , software engineering , management , medline , political science , law , economics , operating system
In the age of big data, enterprises' information systems are flooded with data generated from social media, which raises the need to integrate these data into their business intelligence processes for better decision making. However, these new data are streaming, voluminous, unstructured, and varied, and they overwhelm existing data warehousing systems and integration tools, which motivated this research. In this paper, we propose a large-scale system based on distributed storage and parallel processing for social media data warehousing. Specifically, we combine Storm and Hadoop to extract structured events from social media data and integrate them into the data warehouse. We take advantage of the real-time analysis of streaming data offered by Storm and the batch processing of large data volumes offered by Hadoop, which together facilitate the analysis of streaming social media data. For conceptual representation, we propose a customized multidimensional model in which we add an intermediate table to connect the social media data warehouse with the enterprise data warehouse. We implement it using Oracle 12c and populate it with events extracted from 1,000,000 tweets using the Pentaho Data Integration tool.
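The role of the intermediate table can be illustrated with a minimal sketch. The abstract does not give the actual schema, so everything below is an assumption made for illustration: the table and column names (`fact_event`, `dim_location`, `ent_dim_region`, `bridge_location_region`), the choice of location as the linked dimension, the sample rows, and the use of SQLite in place of Oracle 12c. It shows only the general idea of a bridge table that lets event facts from the social media side be aggregated along an enterprise-side dimension.

```python
import sqlite3


def build_schema(conn):
    """Create a hypothetical social media star schema plus a bridge table."""
    conn.executescript("""
        -- Social media side (all names are illustrative assumptions).
        CREATE TABLE dim_location (
            location_id INTEGER PRIMARY KEY,
            city        TEXT NOT NULL
        );
        CREATE TABLE fact_event (
            event_id    INTEGER PRIMARY KEY,
            location_id INTEGER NOT NULL REFERENCES dim_location(location_id),
            substance   TEXT NOT NULL,
            tweet_count INTEGER NOT NULL
        );
        -- Enterprise side (hypothetical dimension of the existing warehouse).
        CREATE TABLE ent_dim_region (
            region_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        -- Intermediate (bridge) table connecting the two warehouses.
        CREATE TABLE bridge_location_region (
            location_id INTEGER NOT NULL REFERENCES dim_location(location_id),
            region_id   INTEGER NOT NULL REFERENCES ent_dim_region(region_id),
            PRIMARY KEY (location_id, region_id)
        );
    """)


def load_sample(conn):
    """Insert made-up rows; these are not data from the paper."""
    conn.executemany("INSERT INTO dim_location VALUES (?, ?)",
                     [(1, "CityA"), (2, "CityB")])
    conn.executemany("INSERT INTO ent_dim_region VALUES (?, ?)",
                     [(1, "North"), (2, "South")])
    conn.executemany("INSERT INTO bridge_location_region VALUES (?, ?)",
                     [(1, 1), (2, 2)])
    conn.executemany("INSERT INTO fact_event VALUES (?, ?, ?, ?)",
                     [(1, 1, "substance_x", 5),
                      (2, 1, "substance_y", 2),
                      (3, 2, "substance_x", 4)])


def tweets_per_region(conn):
    """Aggregate social media event facts along the enterprise region dimension."""
    rows = conn.execute("""
        SELECT r.name, SUM(f.tweet_count)
        FROM fact_event AS f
        JOIN bridge_location_region AS b ON b.location_id = f.location_id
        JOIN ent_dim_region AS r ON r.region_id = b.region_id
        GROUP BY r.name
        ORDER BY r.name
    """)
    return rows.fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_schema(connection)
    load_sample(connection)
    print(tweets_per_region(connection))
```

Keeping the link in a separate table means neither warehouse's existing tables need to change, which is one plausible reason for choosing an intermediate table over adding foreign keys directly to the fact table.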