
A review of evidence about use and performance of species distribution modelling ensembles like BIOMOD
Author(s) -
Hao Tianxiao,
Elith Jane,
Guillera‐Arroita Gurutzeta,
Lahoz‐Monfort José J.
Publication year - 2019
Publication title -
Diversity and Distributions
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.918
H-Index - 118
eISSN - 1472-4642
pISSN - 1366-9516
DOI - 10.1111/ddi.12892
Subject(s) - computer science , ensemble learning , popularity , ensemble forecasting , predictive modelling , process (computing) , distribution (mathematics) , taxon , environmental niche modelling , machine learning , ecology , artificial intelligence , data science , econometrics , data mining , ecological niche , mathematics , biology , psychology , mathematical analysis , habitat , social psychology , operating system
Aim The idea of combining predictions from different models into an ensemble has gained considerable popularity in species distribution modelling, partly due to free and comprehensive software such as the R package BIOMOD. However, despite the proliferation of ensemble models, we lack an overview of how and where they are used for modelling distributions, and how well they perform. Here, we present such an overview.

Location Global.

Methods Since BIOMOD is freely available and widely used by ensemble species distribution modellers, we focused on articles that apply BIOMOD, filtering the initial 852 papers identified in our structured literature search to a final subset of 224 eligible peer‐reviewed journal articles.

Results BIOMOD‐based ensembles are used across many taxa and locations, with terrestrial plants being the most represented group of species (n = 72) and Europe being the most represented continent (n = 106). These studies often focus on forecasting future distributions (n = 109), and commonly use presence‐only species data (n = 139) and climatic environmental predictors (n = 219). On average, six models are used per ensemble, and approximately half of ensembles weight the contributions of models by their cross‐validation performance. However, discussion of the choices made in the modelling process is limited, as is unambiguous information on the performance of ensemble models versus individual models. The use of independent data to validate model performance is particularly uncommon.

Main conclusions We document the breadth of ensemble applications, but could not draw strong quantitative conclusions about the predictive performance of ensemble models, because unambiguous performance information is rarely reported. Understanding how and where ensembles are best used when modelling species distributions is important for making the best choices for different applications. To support this goal, we provide recommendations for thorough reporting practices in a BIOMOD‐based ensemble workflow.
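
The Results note that roughly half of the reviewed ensembles weight each model's contribution by its cross‐validation performance. As an illustration only, the following Python sketch shows one simple way such a weighting could work: models scoring below an optional threshold are dropped, and the rest are averaged with weights proportional to their scores. This is not BIOMOD's implementation; the function name `weighted_ensemble`, the `threshold` parameter, and all numbers are hypothetical and chosen purely for the example.

```python
import numpy as np


def weighted_ensemble(predictions, scores, threshold=None):
    """Combine per-model suitability predictions into a score-weighted mean.

    predictions: array-like of shape (n_models, n_sites), values in [0, 1].
    scores: one cross-validation score per model (e.g. AUC), shape (n_models,).
    threshold: optional minimum score; models scoring below it are dropped.
    """
    predictions = np.asarray(predictions, dtype=float)
    scores = np.asarray(scores, dtype=float)
    if predictions.shape[0] != scores.shape[0]:
        raise ValueError("need exactly one score per model")

    # Keep every model unless a minimum score is requested.
    if threshold is None:
        keep = np.ones_like(scores, dtype=bool)
    else:
        keep = scores >= threshold
    if not keep.any():
        raise ValueError("no model meets the score threshold")

    # Weights are proportional to the scores of the retained models.
    weights = scores[keep] / scores[keep].sum()
    return weights @ predictions[keep]


# Made-up predictions from three hypothetical models at three sites.
preds = np.array([
    [0.2, 0.8, 0.5],
    [0.4, 0.6, 0.5],
    [0.9, 0.1, 0.5],
])
cv_scores = np.array([0.90, 0.80, 0.55])  # hypothetical cross-validation scores

ensemble = weighted_ensemble(preds, cv_scores, threshold=0.7)
unweighted = preds.mean(axis=0)  # plain committee average, for comparison
```

In this toy case the third model falls below the threshold and is excluded, so the weighted ensemble differs from the unweighted mean at the sites where that model disagrees with the other two. Whether weighting actually improves predictions is the kind of question the review found hard to answer from published reports.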