A Bayesian spatial categorical model for prediction to overlapping geographical areas in sample surveys
Author(s) - Shuvo Bakar K., Biddle Nicholas, Kokic Philip, Jin Huidong
Publication year - 2020
Publication title - Journal of the Royal Statistical Society: Series A (Statistics in Society)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.103
H-Index - 84
eISSN - 1467-985X
pISSN - 0964-1998
DOI - 10.1111/rssa.12526
Subject(s) - categorical variable, small area estimation, geography, bayesian probability, sample (material), geographic information system, cartography, set (abstract data type), spatial analysis, census, data mining, computer science, statistics, econometrics, machine learning, mathematics, remote sensing, population, demography, estimator, sociology, programming language, chemistry, chromatography
Summary - Motivated by the Australian National University poll, we consider a situation where survey data have been collected from respondents for several categorical variables and a primary geographic classification, e.g. postcode. Here, a common and important problem is to obtain estimates for a second target geography that overlaps with the primary geography but has not been collected from the respondents. We examine this problem when areal level census information is available for both geographic classifications. Such a situation is challenging from a small area estimation perspective for several reasons: there is a misalignment between the census and survey information as well as the geographical classifications; the geographic areas are potentially small and so prediction can be difficult because of the sparse or spatially missing data issue; and there is the possibility of non‐stationary spatial dependence. To address these problems we develop a Bayesian model using latent processes, underpinned by a non‐stationary spatial basis that combines Moran's I and multiresolution basis functions with a small but representative set of knots. The study results based on simulated data demonstrate that the model can be highly effective and gives more accurate estimates for areas defined by the target geography than several existing models. The model also performs well for the Australian National University poll data to predict on a second geographic classification: statistical area level 2.
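Illustrative note - The summary refers to Moran's I basis functions without describing their construction. As a rough sketch only, and not the authors' implementation, one common way to build such a basis is to take the leading eigenvectors of the Moran operator P A P, where A is an area adjacency matrix and P projects onto the orthogonal complement of the covariates. The function name `moran_basis`, the intercept-only default design and the five-area toy geography below are assumptions made for illustration; the multiresolution component and the knot selection described in the summary are not shown.

```python
import numpy as np


def moran_basis(adjacency, n_basis, covariates=None):
    """Leading eigenpairs of the Moran operator P A P (hypothetical helper)."""
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    # Design matrix: intercept only unless covariates are supplied.
    if covariates is None:
        X = np.ones((n, 1))
    else:
        X = np.column_stack([np.ones(n), np.asarray(covariates, dtype=float)])
    # Orthogonal projection onto the complement of the column space of X.
    P = np.eye(n) - X @ np.linalg.pinv(X)
    M = P @ A @ P
    # Symmetrise to remove floating-point asymmetry before calling eigh.
    M = (M + M.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    # Keep the n_basis largest eigenvalues (strongest positive spatial pattern).
    order = np.argsort(eigvals)[::-1][:n_basis]
    return eigvals[order], eigvecs[:, order]


# Assumed toy geography: five areas in a row, each adjacent to its neighbours.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

values, basis = moran_basis(A, n_basis=2)
```

Each retained column of `basis` is orthogonal to the intercept, so the spatial random effects built from it do not compete with the fixed effects. Keeping only eigenvectors with positive eigenvalues restricts the basis to patterns of positive spatial dependence.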