Modelling imperfect presence data obtained by citizen science
Author(s) - Mengersen Kerrie, Peterson Erin E., Clifford Samuel, Ye Nan, Kim June, Bednarz Tomasz, Brown Ross, James Allan, Vercelloni Julie, Pearse Alan R., Davis Jacqueline, Hunter Vanessa
Publication year - 2017
Publication title - Environmetrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.68
H-Index - 58
eISSN - 1099-095X
pISSN - 1180-4009
DOI - 10.1002/env.2446
Subject(s) - citizen science , computer science , imperfect , sample (material) , variance (accounting) , constraint (computer aided design) , data set , data quality , econometrics , sample size determination , data science , set (abstract data type) , data mining , statistics , artificial intelligence , mathematics , linguistics , philosophy , botany , geometry , accounting , operations management , chromatography , economics , business , biology , programming language , metric (unit) , chemistry
There is growing awareness of the potential benefit of harnessing citizen science for research, particularly in the biological and environmental sciences. Data quality, in particular imperfect observations, is a major constraint on the use of citizen‐science data. In this paper, we fit species distribution models to presence‐only data (presences and counts, with no absences observed) by exploiting the uncertainty in reported presences, instead of generating pseudo‐absences as is common in previous presence‐only studies. This approach allows us to extend the suite of models to include those commonly fit to presence/absence and abundance data. We fit several models to a case study data set of jaguar encounters reported by citizens in the Peruvian Amazon. The true species distribution for the case study data is unknown, and thus we also undertake an extensive simulation study to evaluate model performance. We analyze the sources of error by studying the bias and variance of the models, and we discuss the predictive performance of each model and its ability to recover the true species distribution. The simulation study shows that, although several approaches are capable of recovering the species distribution, the choice of modelling approach is complex and depends on factors such as inferential aim, model complexity, sample size, and computational resources. This study also addresses some issues in dealing with compound‐imperfect observations arising from citizen‐science data, and we discuss further steps needed in this research area.
