An experience report: porting the MG‐RAST rapid metagenomics analysis pipeline to the cloud
Author(s) - Wilke Andreas, Wilkening Jared, Glass Elizabeth M., Desai Narayan L., Meyer Folker
Publication year - 2011
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.1799
Subject(s) - porting, cloud computing, computer science, pipeline (software), software deployment, domain (mathematical analysis), metagenomics, distributed computing, computation, data science, process (computing), virtual machine, cluster (spacecraft), scaling, software engineering, operating system, software, programming language, biology, mathematical analysis, biochemistry, mathematics, gene, geometry
SUMMARY Existing applications in computational biology typically favor an integrated computational platform based on a local cluster. We present a lessons-learned report on scaling up an existing metagenomics application that outgrew the available local cluster hardware. In our example, removing a number of assumptions linked to tight integration allowed us to expand beyond one administrative domain, increase the number and type of machines available to the application, and improve the application's scaling properties. The assumptions made in designing the computational client make it well suited for deployment as a virtual machine inside a cloud. This paper discusses the decision process and describes the suitability of deploying various bioinformatics computations to distributed heterogeneous machines. Copyright © 2011 John Wiley & Sons, Ltd.