Provenance in collection‐oriented scientific workflows
Author(s) - Bowers Shawn, McPhillips Timothy M., Ludäscher Bertram
Publication year - 2007
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.1226
Subject(s) - workflow, provenance, computer science, workflow engine, database, kepler, workflow technology, data collection, world wide web, information retrieval, data science, geology, petrology, stars, statistics, mathematics, computer vision
We describe a provenance model tailored to scientific workflows based on the collection‐oriented modeling and design paradigm. Our implementation within the Kepler scientific workflow system captures the dependencies of data and collection creation events on preexisting data and collections, and embeds these provenance records within the data stream. A provenance query engine operates on self‐contained workflow traces representing serializations of the output data stream for particular workflow runs. We demonstrate this approach in our response to the first provenance challenge. Copyright © 2007 John Wiley & Sons, Ltd.
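The idea of embedding dependency records in the data stream and querying a serialized trace can be illustrated with a minimal Python sketch. It is not the Kepler implementation: the `Dependency` record, the `lineage` function, and the item and actor names are all hypothetical, chosen only to show how a self-contained trace could answer a lineage query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """Hypothetical provenance record embedded in the stream:
    `target` was created by `actor` from the items in `sources`."""
    target: str
    actor: str
    sources: tuple


def lineage(trace, item_id):
    """Return the ids of every item that `item_id` transitively depends on."""
    # Collect direct dependencies from the provenance records in the trace.
    direct = {}
    for token in trace:
        if isinstance(token, Dependency):
            direct.setdefault(token.target, set()).update(token.sources)

    # Walk the dependency graph backwards from the requested item.
    seen = set()
    stack = [item_id]
    while stack:
        current = stack.pop()
        for source in direct.get(current, ()):
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


# A toy serialized trace (assumed layout): plain strings stand for data
# tokens, Dependency objects for the provenance records interleaved with them.
trace = [
    "img1", "img2",
    Dependency("aligned1", "Align", ("img1",)),
    "aligned1",
    Dependency("aligned2", "Align", ("img2",)),
    "aligned2",
    Dependency("atlas", "Average", ("aligned1", "aligned2")),
    "atlas",
]

print(sorted(lineage(trace, "atlas")))
```

Because the dependency records travel with the data, the query needs nothing beyond the trace itself, which is the property the abstract attributes to self-contained workflow traces.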