The Integrated Resource for Reproducibility in Macromolecular Crystallography: Experiences of the first four years
Author(s) -
M. Grabowski,
M. Cymborowski,
Przemyslaw Porebski,
T. Osinski,
I.G. Shabalin,
David R. Cooper,
W. Minor
Publication year - 2019
Publication title -
Structural Dynamics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.415
H-Index - 29
ISSN - 2329-7778
DOI - 10.1063/1.5128672
Subject(s) - computer science , metadata , interoperability , pipeline (software) , resource (disambiguation) , data extraction , raw data , information retrieval , data science , data mining , world wide web , chemistry , medline , computer network , biochemistry , programming language
It has been increasingly recognized that preservation and public accessibility of primary experimental data are cornerstones necessary for the reproducibility of empirical sciences. In the field of macromolecular crystallography, many journals now recommend that authors of manuscripts presenting a new crystal structure deposit their primary experimental data (X-ray diffraction images) in one of the dedicated resources created in recent years. Here, we describe our experiences developing the Integrated Resource for Reproducibility in Macromolecular Crystallography (IRRMC) and present several examples of the crucial role that diffraction data can play in improving previously determined protein structures. In its first four years, several hundred crystallographers have deposited data from over 5200 diffraction experiments performed at over 60 different synchrotron beamlines or home sources all over the world. In addition to improving the resource and curating submitted data, we have been building a pipeline for extraction or, in some cases, reconstruction of the metadata necessary for seamless automated processing. Preliminary analysis indicates that about 95% of the archived data can be automatically reprocessed. This high rate of reprocessing success shows the feasibility of using automated metadata extraction and automated processing as a validation step for the deposition of raw diffraction images. The IRRMC is guided by the Findable, Accessible, Interoperable, and Reusable (FAIR) data management principles.