ENMeval: An R package for conducting spatially independent evaluations and estimating optimal model complexity for Maxent ecological niche models
Author(s) -
Muscarella Robert,
Galante Peter J.,
Soley‐Guardia Mariano,
Boria Robert A.,
Kass Jamie M.,
Uriarte María,
Anderson Robert P.
Publication year - 2014
Publication title - Methods in Ecology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.425
H-Index - 105
ISSN - 2041-210X
DOI - 10.1111/2041-210x.12261
Subject(s) - jackknife resampling, Akaike information criterion, computer science, environmental niche modelling, sample size determination, statistics, data mining, R package, ecological niche, ecology, mathematics, machine learning, biology, estimator, habitat
Summary - Recent studies have demonstrated a need for increased rigour in building and evaluating ecological niche models (ENMs) based on presence‐only occurrence data. Two major goals are to balance goodness‐of‐fit with model complexity (e.g. by ‘tuning’ model settings) and to evaluate models with spatially independent data. These issues are especially critical for data sets suffering from sampling bias, and for studies that require transferring models across space or time (e.g. responses to climate change or spread of invasive species). Efficient implementation of procedures to accomplish these goals, however, requires automation. We developed ENMeval, an R package that: (i) creates data sets for k‐fold cross‐validation using one of several methods for partitioning occurrence data (including options for spatially independent partitions), (ii) builds a series of candidate models using Maxent with a variety of user‐defined settings and (iii) provides multiple evaluation metrics to aid in selecting optimal model settings. The six methods for partitioning data are n−1 jackknife, random k‐folds (= bins), user‐specified folds and three methods of masked geographically structured folds. ENMeval quantifies six evaluation metrics: the area under the curve of the receiver‐operating characteristic plot for test localities (AUC_TEST), the difference between training and testing AUC (AUC_DIFF), two different threshold‐based omission rates for test localities and the Akaike information criterion corrected for small sample sizes (AICc). We demonstrate ENMeval by tuning model settings for eight tree species of the genus Coccoloba in Puerto Rico based on AICc. Evaluation metrics varied substantially across model settings, and models selected with AICc differed from default ones. In summary, ENMeval facilitates the production of better ENMs and should promote future methodological research on many outstanding issues.
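
The partitioning schemes and two of the metrics named in the summary can be illustrated with a minimal Python sketch. This is not the ENMeval API, which is an R package. All function names here are hypothetical, and the checkerboard scheme is only one simple, assumed way to build geographically structured folds, since the summary does not specify the three methods. The AICc expression is the standard small-sample correction, AICc = 2K − 2 ln L + 2K(K + 1)/(n − K − 1), and AUC_DIFF is taken as training AUC minus testing AUC.

```python
import math
import random


def jackknife_folds(n):
    """n-1 jackknife: every locality is its own test fold, so k = n."""
    return list(range(n))


def random_kfolds(n, k, seed=0):
    """Randomly assign n localities to k folds (bins) of near-equal size."""
    rng = random.Random(seed)
    labels = [i % k for i in range(n)]  # balanced fold labels
    rng.shuffle(labels)                 # random assignment to localities
    return labels


def checkerboard_folds(coords, cell_size):
    """Assumed illustration of geographically structured folds:
    localities in alternating grid cells go to fold 0 or fold 1."""
    return [
        (math.floor(x / cell_size) + math.floor(y / cell_size)) % 2
        for x, y in coords
    ]


def aicc(log_likelihood, n_params, n_obs):
    """Akaike information criterion corrected for small sample sizes."""
    denom = n_obs - n_params - 1
    if denom <= 0:
        raise ValueError("AICc is undefined when n_obs <= n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / denom


def auc_diff(auc_train, auc_test):
    """Difference between training and testing AUC."""
    return auc_train - auc_test
```

In a tuning workflow of the kind the summary describes, each candidate model setting would be scored with such metrics and the setting with the lowest AICc (or the best held-out performance) would be preferred.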