Optimizing data stream processing for large‐scale applications
Author(s) - Paolo Cappellari, Mark Roantree, Soon Ae Chun
Publication year - 2018
Publication title - Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.2596
Subject(s) - computer science , stream processing , scalability , data stream mining , analytics , data stream , distributed computing , streaming data , complex event processing , data analysis , event (particle physics) , data mining , data science , database , process (computing) , programming language , telecommunications , physics , quantum mechanics
Summary Stream processing systems are designed to analyze data arriving in real time using continuous queries and to respond when a specific event or sequence of events is detected. An important aspect of these systems is Streaming Analytics, which facilitates statistical calculations on continuous data within the stream. These systems must handle high volumes of data, be scalable, and accommodate a multitude of long‐lived, concurrently running analytics. The challenges in developing stream processing systems include transforming data streams on the fly to match users' query needs, modeling stream transformations so that overlaps and optimization opportunities can be detected, and specifying a methodology to deliver those optimizations. In particular, this work focuses on exposing data stream application internals in order to detect reusable parts and then consolidate applications to optimize computational resource usage. The Streaming Data Analytics Model presented in this paper adopts a declarative approach that enables processing and manipulation of data streams in a simple manner while facilitating the powerful optimizations necessary for managing high volumes of streaming data in real time. An evaluation demonstrates, both theoretically and quantitatively, the high performance offered by our approach.
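The idea of detecting reusable parts and consolidating applications can be illustrated with a minimal sketch. It is not the paper's Streaming Data Analytics Model: it assumes, purely for illustration, that each continuous query is an ordered tuple of operator descriptors, and it shares only common leading operators by merging the queries into a prefix tree. The operator strings and the names `consolidate` and `count_operators` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a query is an ordered tuple of operator descriptors.
Pipeline = Tuple[str, ...]


def consolidate(pipelines: List[Pipeline]) -> Dict[str, dict]:
    """Merge pipelines into a prefix tree so shared leading operators appear once."""
    root: Dict[str, dict] = {}
    for pipeline in pipelines:
        node = root
        for op in pipeline:
            # Reuse the existing branch if this operator was already placed here.
            node = node.setdefault(op, {})
    return root


def count_operators(tree: Dict[str, dict]) -> int:
    """Count operator instances in the consolidated plan."""
    return sum(1 + count_operators(child) for child in tree.values())


# Three illustrative queries over the same (made-up) source stream.
queries: List[Pipeline] = [
    ("source:trades", "filter:price>100", "window:60s", "avg:price"),
    ("source:trades", "filter:price>100", "window:60s", "max:price"),
    ("source:trades", "filter:price>100", "count"),
]

plan = consolidate(queries)
separate = sum(len(q) for q in queries)  # operators if each query runs on its own
shared = count_operators(plan)           # operators after sharing common prefixes

print(f"operators without sharing: {separate}, with sharing: {shared}")
```

In this toy case the three queries need 11 operator instances when run independently and 6 once their common prefix is shared. A real optimizer would also have to decide when two operators are semantically equivalent, which this sketch sidesteps by comparing descriptor strings.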