
Error analysis for hybrid estimates of proportions using big data
Author(s) -
S. M. Tam,
Dennis Trewin,
Lyndon Ang
Publication year - 2022
Publication title -
Statistical Journal of the IAOS
Language(s) - English
Resource type - Journals
eISSN - 1875-9254
pISSN - 1874-7655
DOI - 10.3233/sji-210924
Subject(s) - big data , computer science , data science , survey data collection , estimation , sample (material) , data mining , econometrics , statistics , mathematics , engineering , chemistry , systems engineering , chromatography
Big data, including administrative data, is seen as a new data source for official statistics, especially given the increasing difficulty of achieving acceptable response rates in sample surveys. It might be used directly, or with models that adjust for its shortcomings. Hybrid estimates, which combine big data with complementary survey data, are another technique for overcoming these shortcomings. To decide how big data might be used, we need to understand the nature of the errors in the big data source. The paper describes an Error Framework for analysing errors in big data and hybrid estimates. It also describes the circumstances under which hybrid estimates will be more accurate than either big data or survey data used in isolation. A case study illustrates the application of hybrid estimates in practice. A potential application of hybrid estimation to address the upward biases that often exist in epidemiological modelling is also described.
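As a rough illustration of what a hybrid estimate of a proportion can look like, the sketch below weights the proportion observed in the big data source by that source's share of the population, and uses a survey estimate for the part of the population the big data does not cover. This is one simple textbook-style form, assumed here for illustration only; it is not necessarily the estimator used in the paper. The function name, its parameters, and all numbers are hypothetical.

```python
def hybrid_proportion(n_big, p_big, n_pop, p_survey_rest):
    """Illustrative hybrid estimate of a population proportion.

    n_big         -- number of population units covered by the big data source
    p_big         -- proportion observed among those covered units
    n_pop         -- total population size (assumed known)
    p_survey_rest -- survey estimate of the proportion among uncovered units
    """
    if not 0 < n_big <= n_pop:
        raise ValueError("n_big must be positive and no larger than n_pop")
    for p in (p_big, p_survey_rest):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie in [0, 1]")

    # Share of the population covered by the big data source.
    coverage = n_big / n_pop

    # Covered part comes from big data, uncovered part from the survey.
    return coverage * p_big + (1.0 - coverage) * p_survey_rest


# Hypothetical numbers: the big data source covers 600 of 1,000 units.
big_data_only = 0.5
hybrid = hybrid_proportion(n_big=600, p_big=0.5, n_pop=1000, p_survey_rest=0.25)
print(f"big data alone: {big_data_only:.2f}, hybrid: {hybrid:.2f}")
```

In this made-up example the covered units have a higher proportion than the uncovered ones, so using the big data source alone would overstate the population proportion, while the hybrid form corrects for the coverage gap at the cost of the survey estimate's sampling error.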