Value, but high costs in post-deposition data curation
Author(s) -
Petra ten Hoopen,
Clara Amid,
Pier Luigi Buttigieg,
Evangelos Pafilis,
Panos Bravakos,
Ana Cerdeño-Tárraga,
Richard Gibson,
Tim Kahlke,
Aglaia Legaki,
Kada Narayana Murthy,
Gabriella Papastefanou,
Emiliano Pereira-Flores,
Marc Rosselló,
Ana Luisa Toribio,
Guy Cochrane
Publication year - 2016
Publication title - Database
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.406
H-Index - 62
ISSN - 1758-0463
DOI - 10.1093/database/bav126
Subject(s) - discoverability, data curation, computer science, usability, annotation, information retrieval, value (mathematics), data science, world wide web, linked data, context (archaeology), process (computing), semantic web, geography, artificial intelligence, human–computer interaction, machine learning, archaeology, operating system
Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in the European Nucleotide Archive. We outline the annotation process and summarize findings of this effort aimed at increasing usability of publicly available environmental data. Furthermore, we emphasize the benefits of such an exercise and detail its costs. We conclude that such a third-party annotation approach is expensive and has value as an element of curation, but should form only part of a more sustainable submitter-driven approach. Database URL: http://www.ebi.ac.uk/ena.