IMPROVING THE PRECISION OF ESTIMATES OF THE FREQUENCY OF RARE EVENTS
Author(s) - Dixon Philip M., Ellison Aaron M., Gotelli Nicholas J.
Publication year - 2005
Publication title - Ecology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.144
H-Index - 294
eISSN - 1939-9170
pISSN - 0012-9658
DOI - 10.1890/04-0601
Subject(s) - statistics , bayesian probability , mathematics , sample size determination , regression , probability distribution , event (particle physics) , population , physics , demography , quantum mechanics , sociology
The probability of a rare event is usually estimated directly as the number of times the event occurs divided by the total sample size. Unfortunately, the precision of this estimate is low. For typical sample sizes of N < 100 in ecological studies, the coefficient of variation (CV) of this estimate of the probability of a rare event can exceed 300%. Sample sizes on the order of 10³–10⁴ observations are needed to reduce the CV to below 10%. If it is impractical or impossible to increase the sample size, auxiliary data can be used to improve the precision of the estimate. We describe four approaches for using auxiliary data to improve the precision of estimates of the probability of a rare event: (1) Bayesian analysis that includes prior information about the probability; (2) stratification that incorporates information on the heterogeneity in the population; (3) regression models that account for information correlated with the probability; and (4) inclusion of aggregated data collected at larger spatial or temporal scales. These approaches are illustrated using data on the probability of capture of vespulid wasps by the insectivorous plant Darlingtonia californica. All four methods increase the precision of the estimate relative to the simple frequency-based estimate (absolute precision = 1.26, relative precision [CV] = 70%): stratification (absolute precision = 1.10, CV = 62%); regression models (absolute precision = 1.59, CV = 55%); Bayesian analysis with an informative prior probability distribution (absolute precision = 4.28, CV = 47%); and using temporally aggregated data (absolute precision = 6.75, CV = 36%). When informative auxiliary data is available, we recommend including it when estimating the probability of rare events.
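The sample-size claims in the abstract can be checked with a short sketch. It assumes the simple frequency estimate x/N follows a binomial model, so its CV is sqrt((1 - p) / (N p)). It also illustrates approach (1) with a conjugate Beta prior. The function names, the Beta prior, and the example counts are illustrative assumptions, not the authors' code or data, and the abstract's "absolute precision" measure is not reproduced here because the abstract does not define it.

```python
import math


def cv_of_proportion(p, n):
    """CV of the frequency estimate x/n, assuming x ~ Binomial(n, p)."""
    return math.sqrt((1.0 - p) / (n * p))


def sample_size_for_cv(p, target_cv):
    """Smallest n whose binomial CV is at or below target_cv (up to rounding)."""
    return math.ceil((1.0 - p) / (p * target_cv ** 2))


def beta_posterior_cv(x, n, a, b):
    """CV of the posterior Beta(a + x, b + n - x) for a Beta(a, b) prior.

    The Beta prior is an assumption made for this sketch; the abstract
    does not say which prior family the authors used.
    """
    a_post = a + x
    b_post = b + n - x
    total = a_post + b_post
    mean = a_post / total
    var = a_post * b_post / (total ** 2 * (total + 1))
    return math.sqrt(var) / mean


if __name__ == "__main__":
    # A rare event (p = 0.001) observed with N = 100 samples.
    print(f"CV at p=0.001, N=100: {cv_of_proportion(0.001, 100):.0%}")

    # Sample sizes needed for a 10% CV at two hypothetical event probabilities.
    for p in (0.1, 0.01):
        print(f"N for CV <= 10% at p={p}: {sample_size_for_cv(p, 0.10)}")

    # Hypothetical counts: 2 events in 100 trials, flat vs. informative prior.
    print(f"Posterior CV, Beta(1, 1) prior:  {beta_posterior_cv(2, 100, 1, 1):.2f}")
    print(f"Posterior CV, Beta(3, 97) prior: {beta_posterior_cv(2, 100, 3, 97):.2f}")
```

Under these assumptions the CV exceeds 300% for p = 0.001 at N = 100, and the sample size required for a 10% CV falls in the 10³–10⁴ range for p between 0.01 and 0.1, in line with the figures quoted in the abstract. The informative prior acts like extra pseudo-observations, which is why it lowers the posterior CV in this toy example.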