Comparing Inference Methods for Non‐probability Samples
Author(s) - Buelens Bart, Burger Joep, Brakel Jan A.
Publication year - 2018
Publication title - International Statistical Review
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.051
H-Index - 54
eISSN - 1751-5823
pISSN - 0306-7734
DOI - 10.1111/insr.12253
Subject(s) - inference, computer science, statistical inference, sampling (signal processing), big data, data mining, population, machine learning, the internet, predictive inference, data science, econometrics, frequentist inference, artificial intelligence, statistics, mathematics, bayesian inference, bayesian probability, demography, filter (signal processing), sociology, world wide web, computer vision
Summary - Social and economic scientists are tempted to use emerging data sources such as big data to compile information about finite populations as an alternative to traditional survey samples. These data sources generally cover an unknown part of the population of interest. Simply assuming that analyses of these data apply to larger populations is wrong: the mere volume of data provides no guarantee of valid inference. Tackling this problem with methods originally developed for probability sampling is possible but is shown here to be limited. A wider range of model‐based predictive inference methods proposed in the literature is reviewed and evaluated in a simulation study using real‐world data on annual vehicle mileages. We propose extending this predictive inference framework with machine learning methods for inference from samples generated through mechanisms other than random sampling from a target population. Describing economies and societies using sensor data, internet search data, social media and voluntary opt‐in panels is cost‐effective and timely compared with traditional surveys but requires an extended inference framework, as proposed in this article.
