Open Access
Maximizing human effort for analyzing scientific images: A case study using digitized herbarium sheets
Author(s) - Laura Brenskelle, Rob P. Guralnick, Michael Denslow, Brian J. Stucky
Publication year - 2020
Publication title - Applications in Plant Sciences
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.64
H-Index - 23
ISSN - 2168-0450
DOI - 10.1002/aps3.11370
Subject(s) - herbarium, annotation, citizen science, digitization, set (abstract data type), data science, information retrieval, quality (philosophy), computer science, visualization, scalability, premise, biology, data mining, artificial intelligence, computer vision, database, ecology, philosophy, botany, epistemology, programming language, linguistics
Premise: Digitization and imaging of herbarium specimens provide essential historical phenotypic and phenological information about plants. However, the full use of these resources requires high‐quality human annotations for downstream use. Here we provide guidance on the design and implementation of image annotation projects for botanical research.

Methods and Results: We used a novel gold‐standard data set to test the accuracy of human phenological annotations of herbarium specimen images in two settings: structured, in‐person sessions and an online, community‐science platform. We examined how different factors influenced annotation accuracy and found that botanical expertise, academic career level, and time spent on annotations had little effect on accuracy. Rather, key factors included the traits and taxa being scored, the annotation setting, and the individual scorer. In‐person annotations were significantly more accurate than online annotations, but both generated relatively high‐quality outputs. Gathering multiple, independent annotations for each image improved overall accuracy.

Conclusions: Our results provide a best‐practices basis for using human effort to annotate images of plants. We show that scalable community science mechanisms can produce high‐quality data, but care must be taken to choose tractable taxa and phenophases and to provide informative training material.
