
Balancing transferability and complexity of species distribution models for rare species conservation
Author(s) -
Helmstetter Nolan A.,
Conway Courtney J.,
Stevens Bryan S.,
Goldberg Amanda R.
Publication year - 2021
Publication title -
Diversity and Distributions
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.918
H-Index - 118
eISSN - 1472-4642
pISSN - 1366-9516
DOI - 10.1111/ddi.13174
Subject(s) - transferability, threatened species, cross validation, computer science, biological dispersal, species distribution, range (aeronautics), environmental niche modelling, habitat, regularization (linguistics), sample size determination, rare species, ecology, data mining, machine learning, statistics, artificial intelligence, ecological niche, biology, mathematics, population, logit, materials science, demography, sociology, composite material
Aim
Species distribution models (SDMs) are valuable for rare species conservation and are commonly used to extrapolate predictions of habitat suitability geographically to regions where species occurrence is unknown (i.e., transferability). Spatially structured cross‐validation can be used to infer transferability, yet few studies have evaluated how delineation of cross‐validation folds affects model complexity and predictions. We developed SDMs using multiple cross‐validation approaches to understand the implications for predicting habitat suitability for northern Idaho ground squirrels, a rare, federally threatened species that has been extensively surveyed in regions where known populations occur, resulting in >8000 presence locations.

Location
Idaho, USA.

Methods
We delineated cross‐validation folds either by mimicking the manner in which predictions would be geographically extrapolated or by using existing dispersal barriers. We varied the number of folds, the distance between them, and their directionality. We conducted a grid search on statistical regularization parameters to optimize model complexity, covering a range of values exceeding that typically implemented. For each cross‐validation approach, we selected optimal regularization and model complexity based on out‐of‐sample predictive ability.

Results
Delineation of cross‐validation folds substantially affected resulting model complexity and extrapolated predictions. All cross‐validation approaches resulted in models with apparently high out‐of‐sample predictive ability, yet optimal model complexity varied substantially among the approaches. Varying regularization revealed a noisy relationship between model complexity and predictive performance, with local optima in predictive performance common at small values.

Main conclusion
Subtle modelling decisions can have large consequences for predictions of habitat suitability and transferability of SDMs. When transferability is the goal, cross‐validation approaches should be considered carefully and should mimic the manner in which spatial extrapolation will occur; otherwise, overly complex models with inflated assessments of predictive accuracy may result. Further, spatially structured cross‐validation may not guard against over‐parameterization, and assessing a broader range of regularization parameters may be necessary to optimize model complexity for transferability.
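
The fold delineation and regularization grid search described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the modelling software, so the function names (`assign_band_folds`, `select_regularization`), the band-shaped folds, the `buffer` argument, and the user-supplied `score_fn` are all hypothetical choices made for illustration. The sketch only shows how the number, spacing, and direction of spatial folds can be varied, and how a regularization value might be chosen from out-of-sample scores.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) coordinates, e.g. easting/northing


def assign_band_folds(points: Sequence[Point], n_folds: int,
                      axis: int = 1, buffer: float = 0.0) -> List[int]:
    """Assign each point to one of n_folds equal-width spatial bands.

    axis=0 gives west-to-east bands, axis=1 gives south-to-north bands
    (the "directionality" of the folds). Points closer than `buffer` to an
    internal band boundary get fold -1 and are dropped, which leaves a gap
    of 2 * buffer between adjacent folds. All of this is an assumption made
    for illustration, not the procedure used in the paper.
    """
    values = [p[axis] for p in points]
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("points have no spread along the chosen axis")
    width = (hi - lo) / n_folds
    boundaries = [lo + j * width for j in range(1, n_folds)]
    folds = []
    for v in values:
        k = min(int((v - lo) / width), n_folds - 1)
        near_edge = any(abs(v - b) < buffer for b in boundaries)
        folds.append(-1 if near_edge else k)
    return folds


def select_regularization(
    folds: Sequence[int],
    grid: Sequence[float],
    score_fn: Callable[[List[int], List[int], float], float],
) -> Tuple[float, Dict[float, float]]:
    """Grid search: pick the regularization value with the best mean score.

    score_fn(train_idx, test_idx, reg) is a hypothetical callback that fits
    a model on the training indices with regularization `reg` and returns an
    out-of-sample score on the test indices (higher is better).
    """
    fold_ids = sorted({f for f in folds if f >= 0})
    mean_scores: Dict[float, float] = {}
    for reg in grid:
        scores = []
        for k in fold_ids:
            test_idx = [i for i, f in enumerate(folds) if f == k]
            train_idx = [i for i, f in enumerate(folds) if f >= 0 and f != k]
            scores.append(score_fn(train_idx, test_idx, reg))
        mean_scores[reg] = sum(scores) / len(scores)
    best = max(mean_scores, key=lambda r: mean_scores[r])
    return best, mean_scores
```

Calling `assign_band_folds` with different `n_folds`, `axis`, and `buffer` values yields alternative fold layouts, and running `select_regularization` on each layout with the same grid makes it possible to compare which regularization value each layout favours. Returning the full `mean_scores` dictionary, rather than only the best value, lets the score profile be inspected for the kind of local optima the Results describe.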