Open Access
Trade‐offs in covariate selection for species distribution models: a methodological comparison
Author(s) -
Brodie Stephanie J.,
Thorson James T.,
Carroll Gemma,
Hazen Elliott L.,
Bograd Steven,
Haltuch Melissa A.,
Holsman Kirstin K.,
Kotwicki Stan,
Samhouri Jameal F.,
Willis-Norton Ellen,
Selden Rebecca L.
Publication year - 2020
Publication title -
Ecography
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.973
H-Index - 128
eISSN - 1600-0587
pISSN - 0906-7590
DOI - 10.1111/ecog.04707
Subject(s) - covariate, generalized additive model, econometrics, species distribution, ecology, computer science, lasso (programming language), statistics, mathematics, biology, machine learning, habitat, world wide web
Species distribution models (SDMs) are a common approach to describing species’ space‐use and spatially explicit abundance. With a myriad of model types, methods and parameterization options available, it is challenging to make informed decisions about how to build robust SDMs appropriate for a given purpose. One key component of SDM development is the appropriate parameterization of covariates, such as the inclusion of covariates that reflect underlying processes (e.g. abiotic and biotic covariates) and covariates that act as proxies for unobserved processes (e.g. space and time covariates). It is unclear how different SDMs apportion variance among a suite of covariates, and how parameterization decisions influence model accuracy and performance. To examine trade‐offs in covariate parameterization in SDMs, we explored the attribution of spatiotemporal and environmental variation across a suite of SDMs. We first used simulated species distributions with known environmental preferences to compare three types of SDM: a machine learning model (boosted regression tree), a semi‐parametric model (generalized additive model) and a spatiotemporal mixed‐effects model (vector autoregressive spatiotemporal model, VAST). We then applied the same comparative framework to a case study with three fish species (arrowtooth flounder, Pacific cod and walleye pollock) in the eastern Bering Sea, USA. Model type and covariate parameterization both had significant effects on model accuracy and performance. We found that including either spatiotemporal or environmental covariates typically reproduced patterns of species distribution and abundance across the three models tested, but model accuracy and performance were maximized when both spatiotemporal and environmental covariates were included in the same model framework.
Our results reveal trade‐offs in the current generation of SDM tools between accurately estimating species abundance, accurately estimating spatial patterns, and accurately quantifying underlying species–environment relationships. These comparisons between model types and parameterization options can help SDM users better understand sources of model bias and estimate error.