Open Access
Towards Analyzing Computational Costs of Spark for SARS-CoV-2 Sequences Comparisons on a Commercial Cloud
Author(s) - Alan L. Nunes, Alba Cristina Magalhães Alves de Melo, Cristina Boeres, Daniel de Oliveira, Lúcia Maria de A. Drummond
Publication year - 2021
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5753/wscad.2021.18523
Subject(s) - spark (programming language), overhead (engineering), cloud computing, computer science, virtual machine, real time computing, operating system, programming language
In this paper, we present a Spark application, named Diff Sequences Spark, which compares 540 SARS-CoV-2 sequences from South America on the Amazon EC2 cloud and outputs the positions where the sequences differ. We analyzed the performance of the application on selected memory-optimized and storage-optimized virtual machines (VMs) in the on-demand and spot markets. The memory-optimized VMs achieved lower execution times and lower financial costs than the storage-optimized ones. Regarding the markets, Diff Sequences Spark achieved lower average execution times and monetary costs on spot VMs than on the corresponding on-demand VMs, even in scenarios with several spot revocations, benefiting from the low-overhead fault tolerance of the Spark framework.
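
To make the output described above concrete, the following is a minimal sketch of reporting the positions at which sequences differ. It is not the authors' implementation and does not use Spark. The function names `diff_positions` and `compare_all`, the toy sequences, the position-by-position comparison without alignment, and the all-pairs strategy are all assumptions made for illustration. Under that all-pairs assumption, 540 sequences would yield 540 × 539 / 2 = 145,530 comparisons.

```python
from itertools import combinations


def diff_positions(a: str, b: str) -> list[int]:
    """Return the 0-based positions where two sequences differ.

    Assumption: sequences are compared position by position, with no
    alignment step. Positions present in only one sequence (length
    mismatch) are also reported as differences.
    """
    shared = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    tail = list(range(min(len(a), len(b)), max(len(a), len(b))))
    return shared + tail


def compare_all(sequences: dict[str, str]) -> dict[tuple[str, str], list[int]]:
    """Compare every unordered pair of sequences (hypothetical strategy)."""
    return {
        (p, q): diff_positions(sequences[p], sequences[q])
        for p, q in combinations(sorted(sequences), 2)
    }


# Toy data for illustration only; these are not real SARS-CoV-2 sequences.
toy = {"s1": "ACGTAC", "s2": "ACGAAC", "s3": "TCGTACG"}
result = compare_all(toy)

for pair, positions in result.items():
    print(pair, positions)
```

Each pairwise comparison in this sketch is independent of the others, which is the kind of workload that can be partitioned across workers and recomputed piecewise if a worker is lost.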
