Premium
Health indicators: Eliminating bias from convenience sampling estimators
Author(s) -
Hedt Bethany L.,
Pagano Marcello
Publication year - 2011
Publication title -
statistics in medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.3920
Subject(s) - sample (material) , inference , computer science , estimator , population , sample size determination , simple random sample , data collection , sampling bias , data science , statistics , data mining , econometrics , medicine , mathematics , environmental health , artificial intelligence , chemistry , chromatography
Public health practitioners are often called upon to make inference about a health indicator for a population at large when the sole available information are data gathered from a convenience sample, such as data gathered on visitors to a clinic. These data may be of the highest quality and quite extensive, but the biases inherent in a convenience sample preclude the legitimate use of powerful inferential tools that are usually associated with a random sample. In general, we know nothing about those who do not visit the clinic beyond the fact that they do not visit the clinic. An alternative is to take a random sample of the population. However, we show that this solution would be wasteful if it excluded the use of available information. Hence, we present a simple annealing methodology that combines a relatively small, and presumably far less expensive, random sample with the convenience sample. This allows us to not only take advantage of powerful inferential tools, but also provides more accurate information than that available from just using data from the random sample alone. Copyright © 2011 John Wiley & Sons, Ltd.