Optimizing ensembles of small models for predicting the distribution of species with few occurrences
Author(s) -
Breiner Frank T.,
Nobis Michael P.,
Bergamini Ariel,
Guisan Antoine
Publication year - 2018
Publication title -
Methods in Ecology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.425
H-Index - 105
ISSN - 2041-210X
DOI - 10.1111/2041-210x.12957
Subject(s) - computation, computer science, transferability, predictive modelling, approximate bayesian computation, machine learning, data mining, artificial intelligence, algorithm, logit, inference
Ensembles of Small Models (ESM) represent a novel strategy for species distribution modelling with few observations. ESMs are built by calibrating many small models and then averaging them into an ensemble model where the small models are weighted by their cross-validated scores of predictive performance. In a previous paper (Breiner, Guisan, Bergamini, & Nobis, Methods in Ecology and Evolution, 6, 1210–1218, 2015), we reported two major findings. First, ESMs proved largely superior to standard models in terms of model performance and transferability. Second, ESMs including different modelling techniques did not clearly improve model performance compared to single-technique ESMs. However, ESMs often require a large computation effort, which can become problematic when modelling large numbers of species. Given the appealing new perspectives offered by ESMs, it is especially important to investigate if some techniques yield increased performance while saving computation time and thus could be predominantly used for building ESMs. Here, we present results from a reanalysis of a subset of the data used in Breiner et al. (2015). More specifically, we ran ESMs: (1) fitted with 10 modelling techniques separately (in Breiner et al., 2015 we used only three techniques); and (2) using various parameter options for each modelling technique (i.e., model tuning). We show that ESMs vary in model performance and computation time across techniques, and some techniques are advantageous in terms of optimizing model performance and computation time (i.e., GLM, CTA and ANN). Including one of these modelling techniques could thus optimize computation time compared to using more computing-intensive techniques like GBM. Next, we show that parameter tuning can improve performance and transferability of ESMs, but often at the cost of computation time. Parameter tuning could therefore be used when computing resources are not a limiting factor.
These findings help improve the applicability and performance of ESM s when applied to large numbers of species.
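
The averaging step described in the abstract (small models weighted by their cross-validated performance scores) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the use of two-predictor combinations to define each small model, and the rule that models scoring at or below a threshold receive zero weight are all assumptions made here for illustration. The predictions and scores are made-up numbers, not results from the paper.

```python
from itertools import combinations


def predictor_pairs(predictors):
    """Return all two-predictor combinations.

    Assumption for illustration: each small model uses exactly two predictors.
    """
    return list(combinations(predictors, 2))


def esm_weighted_average(predictions, scores, threshold=0.0):
    """Combine small-model predictions into one ensemble prediction per site.

    predictions: list of lists, one inner list per small model,
                 each holding one predicted suitability per site.
    scores:      one cross-validated performance score per small model.
    threshold:   hypothetical cutoff; models scoring at or below it get weight 0.
    """
    if len(predictions) != len(scores):
        raise ValueError("need exactly one score per small model")

    # Weight each small model by its score; drop poorly performing models.
    weights = [s if s > threshold else 0.0 for s in scores]
    total = sum(weights)
    if total == 0:
        raise ValueError("no small model exceeds the score threshold")

    n_sites = len(predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n_sites)
    ]


if __name__ == "__main__":
    # Hypothetical predictor names; three predictors give three small models.
    pairs = predictor_pairs(["temperature", "precipitation", "slope"])

    # Made-up predictions (two sites) and cross-validated scores per small model.
    preds = [[0.2, 0.8], [0.4, 0.6], [0.9, 0.1]]
    cv_scores = [0.6, 0.2, -0.1]

    print(pairs)
    print(esm_weighted_average(preds, cv_scores))
```

In this sketch the third small model has a negative score and therefore does not contribute to the ensemble. Because the number of small models grows quickly with the number of predictors, the per-model fitting cost of the chosen technique dominates total runtime, which is the trade-off the abstract examines.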