Schrödinger's phenotypes: Herbarium specimens show two‐dimensional images are both good and (not so) bad sources of morphological data
Author(s) - Borges Leonardo M., Reis Victor Candido, Izbicki Rafael
Publication year - 2020
Publication title - Methods in Ecology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.425
H-Index - 105
ISSN - 2041-210X
DOI - 10.1111/2041-210x.13450
Subject(s) - digitization, herbarium, multivariate statistics, computer science, scale (ratio), artificial intelligence, digital image, image (mathematics), pattern recognition (psychology), data mining, computer vision, image processing, cartography, biology, geography, machine learning, ecology
Abstract Museum specimens are the main source of information on organisms' morphological features. Although access to this information was commonly limited to researchers able to visit collections, it is now becoming freely available thanks to the digitization of museum specimens. With these images, we will be able to collectively build large‐scale morphological datasets, but these will only be useful if the limits to this approach are well‐known. To establish these limits, we used two‐dimensional images of plant specimens to test the precision and accuracy of image‐based data and analyses. To test measurement precision and accuracy, we compared leaf measurements taken from specimens and images of the same specimens. Then, we used legacy morphometric datasets to establish differences in the quality of datasets and multivariate analyses between specimens and images. To do so, we compared the multivariate space based on original legacy data to spaces built with datasets simulating image‐based data. We found that trait measurements made from images are as precise as those obtained directly from specimens, but as traits diminish in size, the accuracy drops as well. This decrease in accuracy, however, has a very low impact on dataset and analysis quality. The main problem with image‐based datasets comes from missing observations due to image resolution or organ overlapping. Missing data lowers the accuracy of datasets and multivariate analyses. Although the effect is not strong, this decrease in accuracy suggests caution is needed when designing morphological research that will rely on digitized specimens. As highlighted by images of plant specimens, 2D images are reliable measurement sources, even though resolution issues lower accuracy for small traits. At the same time, the impossibility of observing particular traits affects the quality of image‐based datasets and, thus, of derived analyses. 
Despite these issues, gathering phenotypic data from two‐dimensional images is valid and may support large‐scale studies on the morphology and evolution of a wide diversity of organisms.
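The comparison described in the abstract (the same traits measured on specimens and on their images) can be illustrated with a minimal sketch. It is not the authors' code or analysis: the trait names, the millimetre values, and the two metrics chosen here (relative error of the image mean against the specimen mean as a stand-in for accuracy, and the coefficient of variation of repeated measurements as a stand-in for precision) are all assumptions made for illustration.

```python
from statistics import mean, stdev


def relative_error(image_value, specimen_value):
    """Absolute difference between image and specimen values, relative to the specimen value."""
    return abs(image_value - specimen_value) / specimen_value


def coefficient_of_variation(repeats):
    """Sample standard deviation of repeated measurements divided by their mean."""
    return stdev(repeats) / mean(repeats)


def summarize(specimen, image):
    """Per-trait accuracy (relative error) and precision (CV) for paired measurements."""
    summary = {}
    for trait in specimen:
        summary[trait] = {
            "relative_error": relative_error(mean(image[trait]), mean(specimen[trait])),
            "cv_specimen": coefficient_of_variation(specimen[trait]),
            "cv_image": coefficient_of_variation(image[trait]),
        }
    return summary


# Hypothetical repeated measurements in mm; invented values, not data from the paper.
specimen_mm = {
    "leaf_length": [50.0, 50.2, 49.8],
    "petiole_length": [2.0, 2.1, 1.9],
}
image_mm = {
    "leaf_length": [50.1, 50.3, 49.9],
    "petiole_length": [2.3, 2.4, 2.2],
}

summary = summarize(specimen_mm, image_mm)
for trait, stats in summary.items():
    print(trait, {name: round(value, 4) for name, value in stats.items()})
```

With these invented numbers the smaller trait shows the larger relative error. That outcome is built into the example values and is not evidence for the paper's finding.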