Open Access
Can we model the probability of presence of species without absence data?
Author(s) - Li Wenkai, Guo Qinghua, Elkan Charles
Publication year - 2011
Publication title - Ecography
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.973
H-Index - 128
eISSN - 1600-0587
pISSN - 0906-7590
DOI - 10.1111/j.1600-0587.2011.06888.x
Subject(s) - conditional probability, covariate, principle of maximum entropy, probability distribution, probability model, statistics, entropy (arrow of time), computer science, statistical model, ecology, mathematics, biology, physics, quantum mechanics
In ecological studies, it is useful to estimate the probability that a species occurs at given locations. The probability of presence can be modeled by traditional statistical methods, if both presence and absence data are available. However, the challenge is that most species records contain only presence data, without reliable absence data. Previous presence‐only methods can estimate a relative index of habitat suitability, but cannot estimate the actual probability of presence. In this study, we develop a presence and background learning algorithm (PBL) that is successful in modeling the conditional probability of presence of a simulated species. The model is trained by two completely separate sets: observed presence and background data. Assuming that the probability of presence is one for ‘prototypical presence’ locations where the habitats are maximally suitable for a species, we can estimate a constant that can calibrate the trained model into the actual probability of presence. Experimental results show that the PBL method performs similarly to a presence‐absence method, and significantly better than the widely used maximum entropy method. The new algorithm enables us to model the probability that a species occurs conditional on environmental covariates without absence data. Hence, it has potential to improve modeling of the geographical distributions of species.
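The calibration step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the formula, so what follows is an assumption rather than the authors' published implementation. It assumes that presence records are drawn from locations where the species occurs and that background records are drawn from the whole study area. Under that sampling scheme, a classifier trained to separate the two sets estimates a raw score f(x) = P(s=1|x), and the odds f/(1-f) are proportional to the probability of presence P(y=1|x). If P(y=1|x) = 1 at prototypical presence locations and c denotes the raw score there, the proportionality constant is (1-c)/c. The function names `estimate_c` and `calibrate`, the `top_fraction` parameter, and the use of the highest-scoring presence records as stand-ins for prototypical locations are all hypothetical choices made for illustration.

```python
import numpy as np


def estimate_c(f_presence, top_fraction=0.1):
    """Estimate the constant c from raw scores at observed presence points.

    Assumption (not from the abstract): the highest-scoring presence
    records stand in for 'prototypical presence' locations, where the
    probability of presence is taken to be 1.
    """
    f = np.sort(np.asarray(f_presence, dtype=float))
    k = max(1, int(np.ceil(top_fraction * f.size)))
    return float(f[-k:].mean())


def calibrate(f, c, eps=1e-12):
    """Convert raw presence-vs-background scores into P(y=1|x).

    Uses P(y=1|x) = ((1 - c) / c) * f / (1 - f), which follows from the
    sampling assumptions stated in the lead-in. Results are clipped to
    [0, 1] because estimation noise can push values slightly above 1.
    """
    f = np.clip(np.asarray(f, dtype=float), eps, 1.0 - eps)
    p = (1.0 - c) / c * f / (1.0 - f)
    return np.clip(p, 0.0, 1.0)


# Synthetic check: build the raw scores that an ideal classifier would
# produce for a known probability of presence, then recover it.
p_true = np.linspace(0.0, 1.0, 11)
ratio = 0.5  # hypothetical value of n_presence / (n_background * prevalence)
f_raw = ratio * p_true / (1.0 + ratio * p_true)

c_hat = estimate_c(f_raw, top_fraction=0.05)
p_hat = calibrate(f_raw, c_hat)
```

In practice `f_raw` would come from any probabilistic classifier fitted to presence (label 1) versus background (label 0) data, and the quality of the calibrated output would depend on how well the prototypical-presence assumption holds for the species in question.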