Shape analysis for the automated identification of plants from images of leaves
Author(s) - Hearn, David J.
Publication year - 2009
Publication title - Taxon
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.819
H-Index - 81
eISSN - 1996-8175
pISSN - 0040-0262
DOI - 10.1002/tax.583021
Subject(s) - identification (biology) , pattern recognition (psychology) , plant identification , computer science , artificial intelligence , procrustes analysis , fourier transform , data mining , biological system , mathematics , biology , botany , mathematical analysis
Abstract - Species identification is a necessary component of most studies of biological diversity, and computational approaches are beginning to automate it. In particular, leaves of plants provide taxon‐specific information that has successfully been applied to plant identification. Prior studies have not investigated the number of leaves or the resolution of the digitized leaf image required to represent a species’ shape. Moreover, the relationship between accuracy and the size of the leaf shape database, and methods to integrate automated approaches with more traditional dichotomous keys, have yet to be explored. Here, I use a database of 2,420 leaves from 151 species to address these issues. Using distance metrics derived from Fourier and Procrustes analyses, I find that a minimum of 10 leaves of each species, 100 margin points, and 10 Fourier harmonics are required to accurately represent the leaf shape of a species. These results are used to assess the success of species identification from images of leaves: 72% for all 151 species. The tight relationship between database size and accuracy is then used in conjunction with results from probability theory to predict the accuracy of species identification when dichotomous multiple‐entry keys and combined Fourier and Procrustes analysis are used together. Combining these two approaches to identification can greatly improve identification accuracy. Open‐source software is available to implement the automated distance‐based approach.
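The abstract names distance metrics derived from Fourier and Procrustes analyses, computed from 100 margin points and 10 Fourier harmonics, but does not give formulas. The following is a minimal Python sketch of one common way to build such distances from a leaf outline; it is not the author's open-source software. The function names (`resample_contour`, `fourier_descriptor`, `fourier_distance`, `procrustes_distance`), the use of complex Fourier coefficient magnitudes, and the arc-length resampling are all assumptions made for illustration. Only the defaults of 100 points and 10 harmonics come from the abstract.

```python
import numpy as np


def resample_contour(points, n_points=100):
    """Resample a closed 2-D outline to n_points equally spaced along its arc length."""
    pts = np.asarray(points, dtype=float)
    closed = np.vstack([pts, pts[:1]])  # repeat the first point to close the loop
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])
    targets = np.linspace(0.0, arc[-1], n_points, endpoint=False)
    x = np.interp(targets, arc, closed[:, 0])
    y = np.interp(targets, arc, closed[:, 1])
    return np.column_stack([x, y])


def fourier_descriptor(points, n_harmonics=10):
    """Magnitudes of harmonics +/-1 .. +/-n_harmonics of the outline, scaled to unit norm.

    Assumption: dropping harmonic 0 removes translation, taking magnitudes removes
    rotation and the choice of starting point, and unit-norm scaling removes size.
    Reflection and tracing direction are not normalized here.
    """
    z = points[:, 0] + 1j * points[:, 1]
    if len(z) <= 2 * n_harmonics:
        raise ValueError("need more than 2 * n_harmonics outline points")
    coeffs = np.fft.fft(z) / len(z)
    positive = coeffs[1:n_harmonics + 1]    # harmonics +1 .. +n
    negative = coeffs[-n_harmonics:][::-1]  # harmonics -1 .. -n
    mags = np.abs(np.concatenate([positive, negative]))
    return mags / np.linalg.norm(mags)


def fourier_distance(outline_a, outline_b, n_points=100, n_harmonics=10):
    """Euclidean distance between the Fourier descriptors of two outlines."""
    da = fourier_descriptor(resample_contour(outline_a, n_points), n_harmonics)
    db = fourier_descriptor(resample_contour(outline_b, n_points), n_harmonics)
    return float(np.linalg.norm(da - db))


def procrustes_distance(outline_a, outline_b, n_points=100):
    """Full Procrustes distance between two outlines treated as planar landmark sets.

    Assumption: after resampling, point i of one outline corresponds to point i of
    the other (same starting point and tracing direction).
    """
    def preshape(outline):
        p = resample_contour(outline, n_points)
        z = p[:, 0] + 1j * p[:, 1]
        z = z - z.mean()               # remove translation
        return z / np.linalg.norm(z)   # remove scale

    za, zb = preshape(outline_a), preshape(outline_b)
    overlap = abs(np.vdot(za, zb))     # optimal rotation is absorbed by the modulus
    return float(np.sqrt(max(0.0, 1.0 - overlap ** 2)))
```

In a distance-based identification scheme of the kind the abstract describes, an unknown leaf would be compared against every outline in the reference database and assigned to the species of its nearest neighbors. How the two distances are combined in the paper is not stated in the abstract, so that step is left out of the sketch.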