The Paleoenvironmental Standard Terms (PaST) Thesaurus: Standardizing Heterogeneous Variables in Paleoscience
Author(s) -
Morrill Carrie,
Thrasher Bridget,
Lockshin Samuel N.,
Gille Edward P.,
McNeill Shelley,
Shepherd Ethan,
Gross Wendy S.,
Bauer Bruce A.
Publication year - 2021
Publication title -
Paleoceanography and Paleoclimatology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.927
H-Index - 127
eISSN - 2572-4525
pISSN - 2572-4517
DOI - 10.1029/2020pa004193
Subject(s) - simple knowledge organization system, computer science, information retrieval, thesaurus, interoperability, data science, world wide web, natural language processing, semantic web, semantic web stack, semantic analytics
Paleoscience data are extremely heterogeneous; hundreds of different types of measurements and reconstructions are routinely made by scientists on a variety of types of physical samples. This heterogeneity is one of the biggest barriers to finding paleoclimatic records, to building large‐scale data products, and to the use of paleoscience data beyond the community of specialists. Here, we document the Paleoenvironmental Standard Terms (PaST) thesaurus, the first authoritative vocabulary of standardized variable names for paleoclimatic and paleoenvironmental data developed in a formal knowledge organization structure. This structure is designed to improve data set discovery, support automated processing of data, and provide connectivity to other vocabularies. PaST is now used operationally at the World Data Service for Paleoclimatology (WDS‐Paleo), one of the largest repositories of paleoscience information. Terms from the PaST thesaurus standardize a broad array of paleoenvironmental and paleoclimatic measured and inferred variables, providing enough detail for accurate and precise data discovery and thereby promoting data reuse. We describe the main design decisions and features of the thesaurus, the governance structure for ongoing maintenance, and WDS‐Paleo services that now employ PaST. These services include an advanced search by variable name, an interface for thesaurus navigation, and a machine‐readable representation in the Simple Knowledge Organization System (SKOS) standard. This overview is designed for developers of thesauri, data contributors, and users of the WDS‐Paleo, and serves as a building block for future efforts within the broader paleoscience community to improve how data are described for long‐term findability, accessibility, interoperability, and reusability.
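The abstract mentions two uses of the thesaurus: mapping heterogeneous variable names onto one standard term, and searching through a SKOS-style hierarchy. The Python sketch below illustrates how these might work in principle. It is a minimal sketch, not the WDS-Paleo implementation: the `Concept` class, the helper functions, and all three terms are hypothetical and are not taken from PaST. It assumes only that each concept has one preferred label, optional alternate labels, and at most one broader concept, mirroring the SKOS properties prefLabel, altLabel, and broader.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A SKOS-style concept: one preferred label, synonyms, one parent."""
    pref_label: str
    alt_labels: List[str] = field(default_factory=list)
    broader: Optional[str] = None  # preferred label of the parent, if any


# Hypothetical terms for illustration only; NOT actual PaST entries.
CONCEPTS = [
    Concept("isotope ratio"),
    Concept("delta 18O", alt_labels=["d18O", "oxygen isotope ratio"],
            broader="isotope ratio"),
    Concept("delta 13C", alt_labels=["d13C"], broader="isotope ratio"),
]


def build_label_index(concepts: List[Concept]) -> Dict[str, str]:
    """Map every lowercased preferred or alternate label to its preferred label."""
    index: Dict[str, str] = {}
    for concept in concepts:
        for label in [concept.pref_label, *concept.alt_labels]:
            index[label.lower()] = concept.pref_label
    return index


def standardize(name: str, index: Dict[str, str]) -> Optional[str]:
    """Return the preferred label for a contributor-supplied name, or None."""
    return index.get(name.strip().lower())


def narrower_closure(label: str, concepts: List[Concept]) -> List[str]:
    """Return a concept followed by all of its descendants (depth-first)."""
    found = [label]
    for concept in concepts:
        if concept.broader == label:
            found.extend(narrower_closure(concept.pref_label, concepts))
    return found


index = build_label_index(CONCEPTS)
print(standardize("D18O", index))                    # delta 18O
print(narrower_closure("isotope ratio", CONCEPTS))   # ['isotope ratio', 'delta 18O', 'delta 13C']
```

In this sketch, a search on a broad term is expanded to its narrower terms before matching data sets, and contributor spellings are resolved through alternate labels. A production thesaurus would use persistent identifiers rather than label strings as keys.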