Analysis of multiple‐variable missing‐not‐at‐random survey data for child lead surveillance using NHANES
Author(s) -
Roberts Eric M.,
English Paul B.
Publication year - 2016
Publication title -
Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.7067
Subject(s) - national health and nutrition examination survey, missing data, context (archaeology), multivariate statistics, survey data collection, statistics, poverty, medicine, demography, econometrics, geography, environmental health, population, mathematics, archaeology, sociology, economic growth, economics
Background: Although ongoing, multi‐topic surveys form the basis of public health surveillance in many countries, their utility for specific subject matter areas can be limited by high proportions of missing data. For example, the National Health and Nutrition Examination Survey (NHANES) is the main resource for surveillance of elevated blood lead levels (EBLLs) in US children, but key predictor variables are missing for as many as 35% of respondents.

Methods: Using a Bayesian framework, we formulate a t‐distributed Heckman selection model applicable to the case of multiple missing‐not‐at‐random variables in the context of a complex survey design. We demonstrate the utility of the results by applying the estimated coefficients to data from the US Census Bureau's American Community Survey, which yields prevalence estimates for lead levels exceeding 2.5, 5.0, and 10.0 µg/dL among children 1 to 5 years of age for a variety of time points and geographies.

Results: We present a protocol for estimating posterior distributions of parameters using Gibbs and grid sampling steps. Stark disparities in the prevalence of EBLL by race/ethnicity, age of housing, and poverty are readily quantified, and three‐ to five‐fold differences in predicted prevalence across geographies within the US are presented.

Conclusions: We are able to conduct multivariate analyses of EBLLs that incorporate the crucial variable age of housing; such analyses have not previously been available using these data. This expands the utility of NHANES in a way that is likely to be relevant to many similar ongoing, multi‐topic health surveillance efforts.

Copyright © 2016 John Wiley & Sons, Ltd.
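The prediction step described in the abstract (applying posterior coefficients to census-style population counts to obtain the share of children above each blood lead threshold) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the covariates, the coefficient values, the degrees of freedom, the stratum counts, and the choice to model log blood lead with a scaled Student-t error. The function names `draw_student_t` and `predicted_prevalence` are hypothetical, and the sketch omits the selection model and survey design entirely.

```python
import math
import random


def draw_student_t(rng, df):
    """Draw one standard Student-t variate as normal / sqrt(chi2 / df)."""
    z = rng.gauss(0.0, 1.0)
    # A chi-square variate with df degrees of freedom is Gamma(df/2, scale=2).
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(chi2 / df)


def predicted_prevalence(draws, strata, thresholds, n_sim=200, seed=0):
    """Monte Carlo estimate of the population share exceeding each threshold.

    draws:  posterior draws, each with 'beta' (coefficients), 'scale', 'df'
    strata: population cells, each with covariates 'x' and a 'children' count
    """
    rng = random.Random(seed)
    total_children = sum(s["children"] for s in strata)
    exceed = {t: 0.0 for t in thresholds}
    for d in draws:
        for s in strata:
            # Linear predictor on the log scale (assumed model form).
            mean_log = sum(b * x for b, x in zip(d["beta"], s["x"]))
            share = s["children"] / total_children
            for _ in range(n_sim):
                noise = d["scale"] * draw_student_t(rng, d["df"])
                bll = math.exp(mean_log + noise)  # simulated level in µg/dL
                for t in thresholds:
                    if bll > t:
                        exceed[t] += share
    denom = len(draws) * n_sim
    return {t: exceed[t] / denom for t in thresholds}


# Hypothetical inputs: x = [intercept, pre-1950 housing, below poverty line].
DRAWS = [
    {"beta": [0.10, 0.50, 0.35], "scale": 0.60, "df": 5.0},
    {"beta": [0.15, 0.45, 0.40], "scale": 0.55, "df": 6.0},
    {"beta": [0.05, 0.55, 0.30], "scale": 0.65, "df": 4.0},
]
STRATA = [
    {"x": [1, 0, 0], "children": 5000},
    {"x": [1, 1, 0], "children": 1500},
    {"x": [1, 0, 1], "children": 2000},
    {"x": [1, 1, 1], "children": 1500},
]
THRESHOLDS = [2.5, 5.0, 10.0]

PREVALENCE = predicted_prevalence(DRAWS, STRATA, THRESHOLDS)
for threshold in THRESHOLDS:
    print(f"share above {threshold} µg/dL: {PREVALENCE[threshold]:.4f}")
```

Averaging over posterior draws rather than plugging in a single point estimate is what lets parameter uncertainty carry through to the prevalence figures. Running the same function on different sets of strata would give the kind of cross-geography comparison the abstract mentions.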