Unraveling scientists' multiple data roles dealing with genomic data
Author(s) - Huang Hong
Publication year - 2018
Publication title - Proceedings of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.193
H-Index - 14
ISSN - 2373-9231
DOI - 10.1002/pra2.2018.14505501135
Subject(s) - data curation, genomics, raw data, data science, data management, computer science, genome browser, resource, reuse, big data, genome, research data, computational biology, biology, database, data mining, genetics, gene, computer network, ecology, programming language
This study contributes to a better understanding of how 135 genomics scientists perceive their data roles in genome curation work. The study was guided by sixteen previously identified data role items (Salomone et al., 2011) and was intended to define a model that groups data roles in genome curation. Analysis of the results revealed that genomics scientists hold specific sets of data roles in genome curation work, and the sixteen data role items were reduced to four factor constructs. Data producer and data custodian are ranked the highest, followed by data manager and data consumer. The findings indicate that genomics scientists primarily deal with reusing and wrangling data (e.g., cleaning up raw data across platforms and databases). The constructs defined by this study advance the understanding of data roles and their relationships in genomic data curation. In addition, the resulting data role model can serve as a valuable resource for genome scientists in assigning data curation tasks, providing training, and developing data management policies.