Experiences building Globus Genomics: a next‐generation sequencing analysis service using Galaxy, Globus, and Amazon Web Services
Author(s) -
Madduri Ravi K.,
Sulakhe Dinanath,
Lacinski Lukasz,
Liu Bo,
Rodriguez Alex,
Chard Kyle,
Dave Utpal J.,
Foster Ian T.
Publication year - 2014
Publication title -
Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.3274
Subject(s) - computer science, workflow, pipeline transport, scheduling (production processes), software, automation, reuse, service (business), distributed computing, operating system, database, engineering, mechanical engineering, operations management, economy, environmental engineering, waste management, economics
SUMMARY We describe Globus Genomics, a system that we have developed for rapid analysis of large quantities of next‐generation sequencing genomic data. This system achieves a high degree of end‐to‐end automation that encompasses every stage of data analysis including initial data retrieval from remote sequencing centers or storage (via the Globus file transfer system); specification, configuration, and reuse of multistep processing pipelines (via the Galaxy workflow system); creation of custom Amazon Machine Images and on‐demand resource acquisition via a specialized elastic provisioner (on Amazon EC2); and efficient scheduling of these pipelines over many processors (via the HTCondor scheduler). The system allows biomedical researchers to perform rapid analysis of large next‐generation sequencing datasets in a fully automated manner, without software installation or a need for any local computing infrastructure. We report performance and cost results for some representative workloads. Copyright © 2014 John Wiley & Sons, Ltd.
