Open Access
Definitions of dataset in the scientific and technical literature
Author(s) -
Renear Allen H.,
Sacchi Simone,
Wickett Karen M.
Publication year - 2010
Publication title -
Proceedings of the American Society for Information Science and Technology
Language(s) - English
Resource type - Journals
eISSN - 1550-8390
pISSN - 0044-7870
DOI - 10.1002/meet.14504701240
Subject(s) - computer science, documentation, normative, data science, key (lock), data integration, scientific literature, information retrieval, data mining, epistemology, paleontology, philosophy, computer security, biology, programming language
The integration of heterogeneous data in varying formats and from diverse communities requires an improved understanding of the concept of a dataset, and of key related concepts such as format, encoding, and version. Ultimately, a normative formal framework of such concepts will be needed to support the effective curation, integration, and use of shared multi‐disciplinary scientific data. To prepare for the development of this framework, we reviewed the definitions of dataset found in technical documentation and the scientific literature. Four basic features can be identified as common to most definitions: grouping, content, relatedness, and purpose. In this summary of our results we describe each of these features, indicating the directions a more formal analysis might take.
