Large-scale digitization of herbarium specimens: Development and usage of an automated, high-throughput conveyor system




Author(s) - Sweeney Patrick W., Starly Binil, Morris Paul J., Xu Yiming, Jones Aimee, Radhakrishnan Sridhar, Grassa Christopher J., Davis Charles C.

Publication year - 2018

Publication title - Taxon

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.819

H-Index - 81

eISSN - 1996-8175

pISSN - 0040-0262

DOI - 10.12705/671.10

Subject(s) - digitization, herbarium, workflow, container (type theory), computer science, metadata, throughput, task (project management), database, information retrieval, world wide web, engineering, operating system, ecology, telecommunications, biology, mechanical engineering, systems engineering, wireless

The billions of specimens housed in natural science collections provide a tremendous source of under-utilized data that are useful for scientific research, conservation, commerce, and education. Digitization and mobilization of specimen data and images promise to greatly accelerate their utilization. While digitization of natural science collection specimens has been occurring for decades, the vast majority of specimens remain un-digitized. If the digitization task is to be completed in the near future, innovative, high-throughput approaches are needed. To create a dataset for the study of global change in New England, we designed and implemented an industrial-scale, conveyor-based digitization workflow for herbarium specimen sheets. The workflow is a variation of an object-to-image-to-data workflow that prioritizes imaging and the capture of storage container-level data. The workflow utilizes a novel conveyor system developed specifically for the task of imaging flattened herbarium specimens. Using our workflow, we imaged and transcribed specimen-level data for almost 350,000 specimens over a 131-week period; an additional 56 weeks were required for storage container-level data capture. Our project has demonstrated that it is possible to capture both an image of a specimen and a core database record in 35 seconds per herbarium sheet (for intervals between images of 30 minutes or less) plus some additional overhead for container-level data capture. This rate was in line with the pre-project expectations for our approach. Our throughput rates are comparable with those of some other similar, high-throughput approaches focused on digitizing herbarium sheets and are as much as three times faster than rates achieved with more conventional non-automated approaches used during the project. We report on challenges encountered during development and use of our system and discuss ways in which our workflow could be improved.
The conveyor apparatus software, database schema, configuration files, hardware list, and conveyor schematics are available for download on GitHub.
