Open Access
EXPLORING TIMEOUT AS A PERFORMANCE AND AVAILABILITY FACTOR OF DISTRIBUTED REPLICATED DATABASE SYSTEMS
Author(s) -
Anatoliy Gorbenko,
Olga Tarasyuk
Publication year - 2020
Publication title -
radìoelektronnì ì komp'ûternì sistemi
Language(s) - English
Resource type - Journals
eISSN - 2663-2012
pISSN - 1814-4225
DOI - 10.32620/reks.2020.4.09
Subject(s) - timeout , workload , computer science , latency (audio) , distributed computing , consistency (knowledge bases) , fault tolerance , data consistency , high availability , database , computer network , operating system , telecommunications , artificial intelligence
Distributed replicated data stores such as Cassandra, HBase, and MongoDB were proposed to manage Big Data sets whose volume, velocity, and variability are difficult to handle with traditional relational database management systems. Trade-offs between consistency, availability, partition tolerance, and latency are intrinsic to such systems. Although the well-known CAP theorem identifies the relations between these properties in qualitative terms, it is still necessary to quantify how different consistency and timeout settings affect system latency. The paper reports the results of a Cassandra performance evaluation using the YCSB benchmark and experimentally demonstrates how read latency depends on the consistency settings and the current database workload. These results clearly show that stronger data consistency increases system latency, which is in line with the qualitative implication of the CAP theorem. Moreover, Cassandra latency and its variation depend considerably on the system workload. The distributed nature of such a system means there is no guarantee that the client always receives a response from the database within a finite time. When it does not, a so-called timing failure occurs: the response is received too late or is not received at all. The paper therefore also examines the application timeout, which is a fundamental part of all distributed fault-tolerance mechanisms working over the Internet, where it serves as the main error detection mechanism, and which is the main determinant in the interplay between system availability and responsiveness. It is shown quantitatively how different timeout settings can affect system availability and the average servicing and waiting time.
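The interplay between a fixed timeout, availability, and waiting time can be illustrated with a minimal sketch. It is not taken from the paper: the function name `timeout_tradeoff`, the definitions used (availability as the share of requests answered within the timeout, waiting time as the response latency capped at the timeout), and the latency samples are all assumptions made for illustration.

```python
import math

def timeout_tradeoff(latencies_ms, timeout_ms):
    """Return (availability, mean_wait_ms) for a fixed application timeout.

    Assumed definitions (illustrative, not from the paper):
    - a request counts as served if its response arrives within timeout_ms;
    - the client waits min(latency, timeout_ms) for each request.
    A response that never arrives is modelled as math.inf.
    """
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    served = sum(1 for t in latencies_ms if t <= timeout_ms)
    availability = served / len(latencies_ms)
    mean_wait_ms = sum(min(t, timeout_ms) for t in latencies_ms) / len(latencies_ms)
    return availability, mean_wait_ms

# Hypothetical latency samples in milliseconds; two responses are never received.
samples = [5, 8, 12, 20, 35, 60, 150, 400, math.inf, math.inf]

for timeout in (50, 500):
    availability, mean_wait = timeout_tradeoff(samples, timeout)
    print(f"timeout={timeout} ms  availability={availability:.2f}  mean wait={mean_wait:.1f} ms")
```

With these made-up samples, raising the timeout from 50 ms to 500 ms lifts availability from 0.50 to 0.80 but increases the mean waiting time from 33 ms to 169 ms, which is the kind of trade-off the abstract describes.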
Although many modern distributed systems, including Cassandra, use static timeouts, it was shown that the most promising approach is to set timeouts dynamically at run time in order to balance performance and availability and to improve the efficiency of fault-tolerance mechanisms.
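One possible way to set a timeout dynamically is sketched below. The abstract does not say which adaptation rule the authors propose, so this example substitutes a well-known estimator, the smoothed mean plus deviation rule used for the TCP retransmission timer. The class name `AdaptiveTimeout` and all parameter values are assumptions chosen for illustration.

```python
class AdaptiveTimeout:
    """Run-time timeout estimate from observed response latencies.

    Assumption: uses the smoothed-mean-plus-deviation rule known from the
    TCP retransmission timer, not a method described in the paper.
    """

    def __init__(self, initial_ms=1000.0, alpha=0.125, beta=0.25, k=4.0, floor_ms=1.0):
        self.alpha = alpha          # weight of a new sample in the smoothed latency
        self.beta = beta            # weight of a new sample in the deviation estimate
        self.k = k                  # safety margin, in deviations above the mean
        self.floor_ms = floor_ms    # lower bound on the timeout
        self.smoothed_ms = None     # smoothed latency, unset until the first sample
        self.deviation_ms = None    # smoothed absolute deviation
        self.timeout_ms = initial_ms

    def observe(self, latency_ms):
        """Update the estimate with one measured latency and return the new timeout."""
        if self.smoothed_ms is None:
            # First sample: initialise the mean and a conservative deviation.
            self.smoothed_ms = latency_ms
            self.deviation_ms = latency_ms / 2
        else:
            # Update the deviation first, using the previous smoothed latency.
            error = abs(self.smoothed_ms - latency_ms)
            self.deviation_ms = (1 - self.beta) * self.deviation_ms + self.beta * error
            self.smoothed_ms = (1 - self.alpha) * self.smoothed_ms + self.alpha * latency_ms
        self.timeout_ms = max(self.floor_ms, self.smoothed_ms + self.k * self.deviation_ms)
        return self.timeout_ms


estimator = AdaptiveTimeout()
for latency in (100.0, 100.0, 180.0, 90.0):   # hypothetical measurements in ms
    print(f"latency={latency} ms -> timeout={estimator.observe(latency):.1f} ms")
```

Under this rule the timeout tightens while latency is stable and widens when latency becomes more variable, for example under a heavier workload. A real client would also need a policy for requests that time out, since those produce no latency sample.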
