Empirical Evaluation of Big Data Analytics using Design of Experiment: Case Studies on Telecommunication Data
Author(s) - Samneet Singh, Yan Liu, Wayne Ding, Zheng Li
Publication year - 2016
Publication title - Services Transactions on Big Data
Language(s) - English
Resource type - Journals
eISSN - 2326-442X
pISSN - 2326-4411
DOI - 10.29268/stbd.2016.3.2.1
Subject(s) - big data , telecommunications , computer science , empirical research , analytics , data science , data mining , statistics , mathematics
Data analytics involves data collection, data analysis, and report generation, a process usually orchestrated by data mining workflow tools. The data analysis step in turn consists of a series of machine learning algorithms. A variety of data mining tools and machine learning algorithms exist, and each has its own set of features, which become factors that affect both the functional and nonfunctional attributes of a data analytics system. Given the domain-specific requirements of data analytics, understanding the effects of these factors and their combinations provides a guideline for selecting workflow tools and machine learning algorithms. In this paper, we develop an empirical evaluation method based on the principle of Design of Experiment. We apply this method to evaluate data mining tools and machine learning algorithms for building big data analytics for telecommunication monitoring data. Two case studies provide insights into the relations between the requirements of data analytics and the choice of a tool or algorithm in the context of data analysis workflows. The demonstration also shows that our evaluation method facilitates replication of this evaluation study and can be conveniently extended to evaluate other tools and algorithms.
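
As an illustration of the Design of Experiment idea described in the abstract, the following minimal sketch enumerates a full factorial design and computes the mean response at each factor level. It is not the authors' method: the factor names and levels (`tool`, `algorithm`, `data_size`) and the helper functions are hypothetical placeholders, since the abstract does not list the factors actually studied.

```python
from itertools import product
from statistics import mean

# Hypothetical factors and levels, chosen only for illustration;
# the abstract does not state which factors the paper evaluates.
FACTORS = {
    "tool": ["tool_A", "tool_B"],
    "algorithm": ["algo_X", "algo_Y"],
    "data_size": ["small", "large"],
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


def main_effects(runs, responses, factors):
    """Return the mean response observed at each level of each factor.

    `responses[i]` is the metric measured for `runs[i]`.
    """
    effects = {}
    for name, levels in factors.items():
        effects[name] = {
            level: mean(r for run, r in zip(runs, responses) if run[name] == level)
            for level in levels
        }
    return effects


# One run per combination: 2 x 2 x 2 levels.
runs = full_factorial(FACTORS)
```

To use the sketch, each entry of `runs` would be executed once (or several times), the metric of interest (for example, execution time or accuracy) recorded in the same order, and the resulting list passed to `main_effects` to compare levels of each factor.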