Feature‐based high‐availability mechanism for quantile tasks in real‐time data stream processing
Author(s) - Ding Weilong, Han Yanbo, Wang Jing, Zhao Zhuofeng
Publication year - 2014
Publication title - Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.2244
Subject(s) - computer science, replica, latency (audio), quantile, overhead (engineering), feature (linguistics), cloud computing, sliding window protocol, stream processing, real time computing, distributed computing, response time, focus (optics), window (computing), operating system, optics, art, telecommunications, linguistics, philosophy, physics, economics, visual arts, econometrics
SUMMARY In a distributed cloud environment, real‐time, continuous data streams make availability during processing essential but expensive. For aggregation tasks in data stream processing systems, traditional replica‐based high‐availability mechanisms incur large run‐time overheads and long recovery latency on failure because of the specific nature of aggregations. In this paper, we focus on typical quantile tasks and propose a feature‐based high‐availability mechanism that reduces both the overhead and the latency. With the help of a monitor module, the quantile feature is maintained incrementally through a histogram synopsis over a time‐based sliding window, so that failed quantile tasks can be recovered efficiently and, with high probability, precisely. The effectiveness of the mechanism is analyzed theoretically, and comprehensive experiments on both synthetic and real data demonstrate an acceptable tradeoff between overhead and performance. Copyright © 2013 John Wiley & Sons, Ltd.
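To make the idea of a histogram synopsis over a time‐based sliding window concrete, the following is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. The class name `SlidingHistogram`, the equi‐width buckets, the per‐pane counters, and all parameters are assumptions chosen for illustration, and timestamps are assumed to arrive in non‐decreasing order.

```python
from collections import deque


class SlidingHistogram:
    """Equi-width histogram over a time-based sliding window (illustrative only)."""

    def __init__(self, lo, hi, bins, window, pane):
        # Hypothetical parameters: value range [lo, hi), number of buckets,
        # window length, and pane length (both in the same time unit).
        self.lo, self.hi, self.bins = lo, hi, bins
        self.window, self.pane = window, pane
        self.width = (hi - lo) / bins
        self.panes = deque()        # (pane_start, counts), oldest first
        self.total = [0] * bins     # merged counts of all live panes

    def _bucket(self, value):
        idx = int((value - self.lo) / self.width)
        return min(max(idx, 0), self.bins - 1)   # clamp out-of-range values

    def _expire(self, now):
        # Drop panes that lie entirely outside (now - window, now].
        while self.panes and self.panes[0][0] + self.pane <= now - self.window:
            _, counts = self.panes.popleft()
            for i, c in enumerate(counts):
                self.total[i] -= c

    def add(self, timestamp, value):
        """Incrementally fold one tuple into the synopsis."""
        self._expire(timestamp)
        start = timestamp - timestamp % self.pane
        if not self.panes or self.panes[-1][0] != start:
            self.panes.append((start, [0] * self.bins))
        b = self._bucket(value)
        self.panes[-1][1][b] += 1
        self.total[b] += 1

    def quantile(self, q):
        """Approximate q-quantile by linear interpolation inside a bucket."""
        n = sum(self.total)
        if n == 0:
            return None
        rank, seen = q * n, 0
        for i, c in enumerate(self.total):
            if c and seen + c >= rank:
                frac = (rank - seen) / c
                return self.lo + (i + frac) * self.width
            seen += c
        return self.hi


hist = SlidingHistogram(lo=0.0, hi=100.0, bins=10, window=10, pane=1)
for t, v in [(0, 5), (1, 15), (2, 25), (3, 35)]:
    hist.add(t, v)
print(hist.quantile(0.5))   # 20.0

hist.add(20, 95)            # earlier panes have slid out of the window
print(hist.quantile(0.5))   # 95.0
```

A synopsis of this kind holds only one small counter array per pane, so it is far smaller than the raw window contents. That is the general reason a summary‐based recovery can be cheaper than replicating or replaying every tuple. How the paper's monitor module stores and restores its quantile feature is not described in the abstract.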