The Life Cycle of Structural Biology Data
Author(s) - Chris Morris
Publication year - 2018
Publication title - Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.358
H-Index - 21
ISSN - 1683-1470
DOI - 10.5334/dsj-2018-026
Subject(s) - clarity, class (philosophy), data science, computer science, data sharing, data access, protein data bank, biological data, data system, data mining, biology, bioinformatics, database, artificial intelligence, protein structure, medicine, biochemistry, alternative medicine, pathology
Research data is acquired, interpreted, published, reused, and sometimes eventually discarded. A better understanding of this life cycle will help the development of appropriate infrastructural services that make it easier for researchers to preserve, share, and find data. Structural biology is a discipline within the life sciences that investigates the molecular basis of life by discovering and interpreting the shapes and motions of macromolecules. Structural biology has a strong tradition of data sharing, expressed by the founding of the Protein Data Bank (PDB) in 1971. The culture of structural biology is therefore already in line with the perspective that data from publicly funded research projects are public data. This review is based on the data life cycle as defined by the UK Data Archive, which identifies six stages: creating data, processing data, analysing data, preserving data, giving access to data, and re-using data. For clarity, 'preserving data' and 'giving access to data' are discussed together. A final stage of the life cycle, 'discarding data', is also discussed. The review concludes with recommendations for future improvements to the IT infrastructure for structural biology.