Premium
An adaptive parallel execution strategy for cloud‐based scientific workflows
Author(s) -
Oliveira Daniel,
Ogasawara Eduardo,
Ocaña Kary,
Baião Fernanda,
Mattoso Marta
Publication year - 2011
Publication title -
concurrency and computation: practice and experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.1880
Subject(s) - computer science , workflow , cloud computing , distributed computing , scalability , workflow management system , workflow technology , scripting language , workflow engine , software engineering , data science , database , operating system
SUMMARY Many of the existing large‐scale scientific experiments modeled as scientific workflows are compute‐intensive. Some scientific workflow management systems already explore parallel techniques, such as parameter sweep and data fragmentation, to improve performance. In those systems, computing resources are used to accomplish many computational tasks in high performance environments, such as multiprocessor machines or clusters. Meanwhile, cloud computing provides scalable and elastic resources that can be instantiated on demand during the course of a scientific experiment, without requiring its users to acquire expensive infrastructure or to configure many pieces of software. In fact, because of these advantages some scientists have already adopted the cloud model in their scientific experiments. However, this model also raises many challenges. When scientists are executing scientific workflows that require parallelism, it is hard to decide a priori the amount of resources to use and how long they will be needed because the allocation of these resources is elastic and based on demand. In addition, scientists have to manage new aspects such as initialization of virtual machines and impact of data staging. SciCumulus is a middleware that manages the parallel execution of scientific workflows in cloud environments. In this paper, we introduce an adaptive approach for executing parallel scientific workflows in the cloud. This approach adapts itself according to the availability of resources during workflow execution. It checks the available computational power and dynamically tunes the workflow activity size to achieve better performance. Experimental evaluation showed the benefits of parallelizing scientific workflows using the adaptive approach of SciCumulus, which presented an increase of performance up to 47.1%. Copyright © 2011 John Wiley & Sons, Ltd.