Ecological prediction at macroscales using big data: Does sampling design matter?
Author(s) -
Soranno Patricia A.,
Cheruvelil Kendra Spence,
Liu Boyang,
Wang Qi,
Tan Pang‐Ning,
Zhou Jiayu,
King Katelyn B. S.,
McCullough Ian M.,
Stachelek Jemma,
Bartley Meridith,
Filstrup Christopher T.,
Hanks Ephraim M.,
Lapierre Jean‐François,
Lottig Noah R.,
Schliep Erin M.,
Wagner Tyler,
Webster Katherine E.
Publication year - 2020
Publication title - Ecological Applications
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.864
H-Index - 213
eISSN - 1939-5582
pISSN - 1051-0761
DOI - 10.1002/eap.2123
Subject(s) - sampling (signal processing) , simple random sample , sampling design , stratified sampling , computer science , set (abstract data type) , range (aeronautics) , data set , ecosystem , sample size determination , statistics , random forest , sampling bias , ecology , sample (material) , data mining , environmental science , machine learning , mathematics , artificial intelligence , biology , population , engineering , chemistry , demography , filter (signal processing) , chromatography , sociology , computer vision , programming language , aerospace engineering
Although ecosystems respond to global change at regional to continental scales (i.e., macroscales), model predictions of ecosystem responses often rely on data from targeted monitoring of a small proportion of sampled ecosystems within a particular geographic area. In this study, we examined how the sampling strategy used to collect data for such models influences predictive performance. We subsampled a large and spatially extensive data set to investigate how macroscale sampling strategy affects prediction of ecosystem characteristics in 6,784 lakes across a 1.8‐million‐km² area. We estimated model predictive performance for different subsets of the data set to mimic three common sampling strategies for collecting observations of ecosystem characteristics: random sampling design, stratified random sampling design, and targeted sampling. We found that sampling strategy influenced model predictive performance such that (1) stratified random sampling designs did not improve predictive performance compared to simple random sampling designs and (2) although one of the scenarios that mimicked targeted (non‐random) sampling had the poorest performing predictive models, the other targeted sampling scenarios resulted in models with similar predictive performance to that of the random sampling scenarios. Our results suggest that although potential biases in data sets from some forms of targeted sampling may limit predictive performance, compiling existing spatially extensive data sets can result in models with good predictive performance that may inform a wide range of science questions and policy goals related to global change.
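
The three subsampling strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the synthetic `lakes` records, the `region` stratifying variable, the "largest lakes first" targeting rule, and all function names are hypothetical assumptions chosen only to show how the strategies differ in which observations they select.

```python
import random
from collections import defaultdict

def simple_random_sample(records, n, rng):
    """Draw n records uniformly at random, without replacement."""
    return rng.sample(records, n)

def stratified_random_sample(records, n, key, rng):
    """Draw at random within each stratum, proportional to stratum size.

    Rounding can make the total differ slightly from n when strata
    are unequal in size.
    """
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = []
    for members in strata.values():
        share = round(n * len(members) / len(records))
        k = min(len(members), max(1, share))
        sample.extend(rng.sample(members, k))
    return sample

def targeted_sample(records, n, score):
    """Non-random selection: keep the n records with the highest score."""
    return sorted(records, key=score, reverse=True)[:n]

# Hypothetical synthetic data: 1,000 lakes in 4 equally sized regions.
rng = random.Random(0)
lakes = [
    {"id": i, "region": i % 4, "area_ha": rng.lognormvariate(3.0, 1.0)}
    for i in range(1000)
]

srs = simple_random_sample(lakes, 100, rng)
strat = stratified_random_sample(lakes, 100, lambda r: r["region"], rng)
targ = targeted_sample(lakes, 100, lambda r: r["area_ha"])

def mean_area(sample):
    return sum(r["area_ha"] for r in sample) / len(sample)

print("simple random :", round(mean_area(srs), 1))
print("stratified    :", round(mean_area(strat), 1))
print("targeted      :", round(mean_area(targ), 1))
```

In this toy setting the targeted subset covers only the upper tail of lake area, which is the kind of bias the abstract suggests may limit predictive performance for some forms of targeted sampling. A predictive model would then be fit on each subset and evaluated on held-out lakes to compare the strategies.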
