Rapid development of cloud-native intelligent data pipelines for scientific data streams using the HASTE Toolkit

Ben Blamey, Salman Toor, Martin Dahlö, Håkan Wieslander, Philip J. Harrison, IdaMaria Sintorn, Alan Sabirsh, Carolina Wählby, Ola Spjuth, Andreas Hellander

Open Access

Publication year - 2021

Publication title - GigaScience

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 2.947

H-Index - 54

ISSN - 2047-217X

DOI - 10.1093/gigascience/giab018

Subject(s) - cloud computing, computer science, data science, streams, data stream mining, pipeline transport, data mining, engineering, operating system, environmental engineering

Large streamed datasets, characteristic of life science applications, are often resource-intensive to process, transport and store. We propose a pipeline model, a design pattern for scientific pipelines, where an incoming stream of scientific data is organized into a tiered or ordered "data hierarchy". We introduce the HASTE Toolkit, a proof-of-concept cloud-native software toolkit based on this pipeline model, to partition and prioritize data streams to optimize use of limited computing resources.
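The idea of organizing an incoming stream into a tiered "data hierarchy" can be illustrated with a minimal Python sketch. This is not the HASTE Toolkit API: the names `Item`, `assign_tier`, `build_hierarchy`, and `toy_score`, as well as the threshold values, are hypothetical and chosen only for illustration. The sketch assumes each item in the stream can be given a numeric priority score and that tiers are defined by descending score cutoffs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Item:
    """One object in the stream (hypothetical type, for illustration only)."""
    name: str
    payload: bytes


def assign_tier(score: float, thresholds: List[float]) -> int:
    """Map a score to a tier index; tier 0 is the highest priority.

    `thresholds` must be sorted in descending order. A score below every
    cutoff falls into the last tier, len(thresholds).
    """
    for tier, cutoff in enumerate(thresholds):
        if score >= cutoff:
            return tier
    return len(thresholds)


def build_hierarchy(
    stream: Iterable[Item],
    score_fn: Callable[[Item], float],
    thresholds: List[float],
) -> Dict[int, List[Item]]:
    """Partition a stream into tiers, preserving arrival order within a tier."""
    tiers: Dict[int, List[Item]] = {t: [] for t in range(len(thresholds) + 1)}
    for item in stream:
        tiers[assign_tier(score_fn(item), thresholds)].append(item)
    return tiers


def toy_score(item: Item) -> float:
    """Assumed stand-in scoring rule: the fraction of non-zero bytes."""
    if not item.payload:
        return 0.0
    return sum(b != 0 for b in item.payload) / len(item.payload)


stream = [
    Item("a", bytes([1, 2, 3, 4])),  # score 1.0
    Item("b", bytes([0, 0, 1, 0])),  # score 0.25
    Item("c", bytes([0, 0, 0, 0])),  # score 0.0
    Item("d", bytes([5, 0, 6, 0])),  # score 0.5
]

hierarchy = build_hierarchy(stream, toy_score, thresholds=[0.75, 0.4])
for tier, items in hierarchy.items():
    print(tier, [item.name for item in items])
```

In a setting with limited computing resources, the top tier could be processed or stored first and lower tiers deferred or discarded; how that decision is made is outside this sketch.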
