
Getting Started Creating Data Dictionaries: How to Create a Shareable Data Set
Author(s) -
Erin Michelle Buchanan,
Sarah E Crain,
Ari L. Cunningham,
Hannah Johnson,
Hannah Stash,
Μαριέττα Παπαδάτου-Παστού,
Peder Mortvedt Isager,
Rickard Carlsson,
Balázs Aczél
Publication year - 2021
Publication title -
Advances in Methods and Practices in Psychological Science
Language(s) - English
Resource type - Journals
eISSN - 2515-2467
pISSN - 2515-2459
DOI - 10.1177/2515245920928007
Subject(s) - computer science, information retrieval, metadata, data discovery, search engine indexing, workflow, world wide web, terminology, set (abstract data type), documentation, process (computing), data science, data sharing, data set, interoperability, data mapping, database, medicine, linguistics, philosophy, alternative medicine, pathology, artificial intelligence, programming language, operating system
As researchers embrace open and transparent data sharing, they will need to provide information about their data that effectively helps others understand their data sets’ contents. Without proper documentation, data stored in online repositories such as OSF will often be rendered unfindable and unreadable by other researchers and indexing search engines. Data dictionaries and codebooks provide a wealth of information about variables, data collection, and other important facets of a data set. This information, called metadata, provides key insights into how the data might be further used in research and facilitates search-engine indexing to reach a broader audience of interested parties. This Tutorial first explains terminology and standards relevant to data dictionaries and codebooks. Accompanying information on OSF presents a guided workflow of the entire process from source data (e.g., survey answers on Qualtrics) to an openly shared data set accompanied by a data dictionary or codebook that follows an agreed-upon standard. Finally, we discuss freely available Web applications to assist this process of ensuring that psychology data are findable, accessible, interoperable, and reusable.
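As a rough illustration of what a data dictionary records, the following minimal Python sketch derives one from a small survey export. It is not the workflow or tooling described in the Tutorial; the column names, the sample values, the `DESCRIPTIONS` mapping, and the helper functions `infer_type` and `build_data_dictionary` are all hypothetical and chosen only for this example.

```python
import csv
import io

# Hypothetical survey export; column names and values are illustrative only.
RAW_CSV = """participant_id,age,condition,satisfaction
p01,23,control,4
p02,31,treatment,5
p03,,control,2
"""

# Human-written descriptions (assumed here): a script cannot infer these.
DESCRIPTIONS = {
    "participant_id": "Anonymous participant identifier",
    "age": "Age in years at time of survey",
    "condition": "Experimental condition assigned",
    "satisfaction": "Satisfaction rating, 1 (low) to 5 (high)",
}


def infer_type(values):
    """Return 'integer' if every non-missing value parses as int, else 'string'."""
    present = [v for v in values if v != ""]
    if not present:
        return "string"
    try:
        for v in present:
            int(v)
    except ValueError:
        return "string"
    return "integer"


def build_data_dictionary(csv_text, descriptions):
    """Build one dictionary entry per column of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows]
        entries.append({
            "variable": name,
            "type": infer_type(values),
            "description": descriptions.get(name, ""),
            "n_missing": sum(1 for v in values if v == ""),
            "n_unique": len({v for v in values if v != ""}),
        })
    return entries


if __name__ == "__main__":
    for entry in build_data_dictionary(RAW_CSV, DESCRIPTIONS):
        print(entry)
```

The automatically derived fields (type, missing count, unique count) only supplement the human-written descriptions, which carry most of the meaning another researcher needs. A real shared data set would also store this information in a format that follows an agreed-upon metadata standard.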