Evaluating presence‐only species distribution models with discrimination accuracy is uninformative for many applications
Author(s) -
Warren Dan L.,
Matzke Nicholas J.,
Iglesias Teresa L.
Publication year - 2020
Publication title -
Journal of Biogeography
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.7
H-Index - 158
eISSN - 1365-2699
pISSN - 0305-0270
DOI - 10.1111/jbi.13705
Subject(s) - computer science, environmental niche modelling, field (mathematics), machine learning, reliability (semiconductor), econometrics, empirical research, experimental data, ecology, artificial intelligence, data mining, ecological niche, statistics, habitat, mathematics, biology, power (physics), physics, quantum mechanics, pure mathematics
Abstract -
Aim - Species distribution models are used across evolution, ecology, conservation and epidemiology to make critical decisions and study biological phenomena, often in cases where experimental approaches are intractable. Choices regarding optimal models, methods and data are typically made based on discrimination accuracy: a model's ability to predict subsets of species occurrence data that were withheld during model construction. However, empirical applications of these models often involve making biological inferences based on continuous estimates of relative habitat suitability as a function of environmental predictor variables. We term the reliability of these biological inferences ‘functional accuracy.’ We explore the link between discrimination accuracy and functional accuracy.
Methods - Using a simulation approach we investigate whether models that make good predictions of species distributions correctly infer the underlying relationship between environmental predictors and the suitability of habitat.
Results - We demonstrate that discrimination accuracy is only informative when models are simple and similar in structure to the true niche, or when data partitioning is geographically structured. However, the utility of discrimination accuracy for selecting models with high functional accuracy was low in all cases.
Main conclusions - These results suggest that many empirical studies and decisions are based on criteria that are unrelated to models’ usefulness for their intended purpose. We argue that empirical modelling studies need to place significantly more emphasis on biological insight into the plausibility of models, and that the current approach of maximizing discrimination accuracy at the expense of other considerations is detrimental to both the empirical and methodological literature in this active field. Finally, we argue that future development of the field must include an increased emphasis on simulation; methodological studies based on ability to predict withheld occurrence data may be largely uninformative about best practices for applications where interpretation of models relies on estimating ecological processes, and will unduly penalize more biologically informative modelling approaches.
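Illustrative sketch - The distinction the abstract draws between discrimination accuracy and functional accuracy can be made concrete with a toy simulation. The Python sketch below is not the authors' procedure: the one-dimensional Gaussian "true niche", the hand-written step-function candidate, the use of AUC as the discrimination score and the use of Pearson correlation as a stand-in for functional accuracy are all assumptions chosen for brevity, and no model is actually fitted. Every function and parameter name is hypothetical.

```python
import math
import random


def true_suitability(env, optimum=0.5, breadth=0.1):
    """Hypothetical 'true niche': Gaussian response to one environmental variable."""
    return math.exp(-((env - optimum) ** 2) / (2 * breadth ** 2))


def step_model(env):
    """Hypothetical candidate model: suitable inside a fixed window, unsuitable outside."""
    return 1.0 if abs(env - 0.5) < 0.2 else 0.0


def simulate_occurrences(n_presence=100, n_background=500, seed=0):
    """Draw presences in proportion to true suitability; background is uniform on [0, 1]."""
    rng = random.Random(seed)
    presences = []
    while len(presences) < n_presence:
        env = rng.random()
        if rng.random() < true_suitability(env):  # rejection sampling
            presences.append(env)
    background = [rng.random() for _ in range(n_background)]
    return presences, background


def discrimination_auc(model, presences, background):
    """AUC: chance that a random presence scores above a random background point."""
    p_scores = [model(e) for e in presences]
    b_scores = [model(e) for e in background]
    wins = 0.0
    for sp in p_scores:
        for sb in b_scores:
            if sp > sb:
                wins += 1.0
            elif sp == sb:
                wins += 0.5  # ties count as half a win
    return wins / (len(p_scores) * len(b_scores))


def functional_agreement(model, n_grid=101):
    """Pearson correlation between modelled and true suitability along the gradient."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    est = [model(e) for e in grid]
    tru = [true_suitability(e) for e in grid]
    mean_est = sum(est) / n_grid
    mean_tru = sum(tru) / n_grid
    cov = sum((a - mean_est) * (b - mean_tru) for a, b in zip(est, tru))
    var_est = sum((a - mean_est) ** 2 for a in est)
    var_tru = sum((b - mean_tru) ** 2 for b in tru)
    return cov / math.sqrt(var_est * var_tru)


if __name__ == "__main__":
    presences, background = simulate_occurrences()
    for name, model in [("true niche", true_suitability), ("step", step_model)]:
        auc = discrimination_auc(model, presences, background)
        agreement = functional_agreement(model)
        print(f"{name}: AUC={auc:.3f}, functional agreement={agreement:.3f}")
```

Reporting the two scores side by side for each candidate is the point of the sketch: the first measures only how well occurrence points are ranked above background points, while the second measures how closely the estimated response curve matches the simulated truth, which is only knowable in a simulation.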