DataDeps.jl: Repeatable Data Setup for Reproducible Data Science
Author(s) - Lyndon White, Roberto Togneri, Wei Liu, Mohammed Bennamoun
Publication year - 2019
Publication title - Journal of Open Research Software
Language(s) - English
Resource type - Journals
ISSN - 2049-9647
DOI - 10.5334/jors.244
Subject(s) - computer science, scripting language, replicate, software, programming language, code (set theory), confusion, repeatability, replication (statistics), data mining, r package, function (biology), software engineering, psychology, statistics, chemistry, mathematics, set (abstract data type), chromatography, evolutionary biology, psychoanalysis, biology
We present DataDeps.jl: a Julia package for the reproducible handling of static datasets, intended to enhance the repeatability of scripts used in the data and computational sciences. It automates the data setup step of running the software that accompanies a paper in order to replicate a result. This step is commonly done manually, which costs time and invites confusion. The same functionality is useful for other packages that require data to function (e.g. a trained machine-learning model). DataDeps.jl simplifies extending research software by automatically managing its data dependencies, and it makes it easier to run another author's code, thus enhancing the reproducibility of data science research.
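The idea of automated data setup described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the DataDeps.jl API: the function name `ensure_data`, its parameters, and the caller-supplied `fetch` function are all hypothetical. The sketch assumes a dataset is identified by a name and a known SHA-256 checksum, is fetched only when it is not already present locally, and is rejected if the checksum does not match.

```python
import hashlib
from pathlib import Path
from typing import Callable


def ensure_data(
    name: str,
    fetch: Callable[[], bytes],
    sha256: str,
    cache_dir: Path,
) -> Path:
    """Return the local path of dataset `name`, fetching it only if absent.

    Hypothetical helper for illustration; not part of DataDeps.jl.
    `fetch` is any zero-argument callable that returns the raw bytes
    (for example, a wrapper around an HTTP download).
    """
    target = cache_dir / name
    if target.exists():
        # Already set up on this machine: reuse it, do not fetch again.
        return target

    payload = fetch()

    # Verify integrity before anything is written to the cache.
    digest = hashlib.sha256(payload).hexdigest()
    if digest != sha256:
        raise ValueError(f"checksum mismatch for {name}: got {digest}")

    cache_dir.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return target
```

A script that calls such a helper at the point where it first needs the data would no longer depend on a manual download step: the first run fetches and verifies the file, and later runs reuse the cached copy.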