Open Access
Combining Information from Multiple Data Sources to Assess Population Health
Author(s) -
Trivellore E. Raghunathan,
Kaushik Ghosh,
Allison B. Rosen,
Paul Imbriano,
Susan T. Stewart,
Irina Bondarenko,
Kassandra L. Messer,
Patricia A. Berglund,
James Shaffer,
David M. Cutler
Publication year - 2020
Publication title - Journal of Survey Statistics and Methodology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.717
H-Index - 15
eISSN - 2325-0992
pISSN - 2325-0984
DOI - 10.1093/jssam/smz047
Subject(s) - national health and nutrition examination survey, beneficiary, sample (material), population health, population, survey data collection, medicine, survey sampling, outcome (game theory), computer science, statistics, econometrics, actuarial science, environmental health, mathematics, business, chemistry, finance, chromatography, mathematical economics
Information about an extensive set of health conditions on a well-defined sample of subjects is essential for assessing population health, gauging the impact of various policies, modeling costs, and studying health disparities. Unfortunately, there is no single data source that provides accurate information about health conditions. We combine information from several administrative and survey data sets to obtain model-based dummy variables for 107 health conditions (diseases, preventive measures, and screening for diseases) for elderly (age 65 and older) subjects in the Medicare Current Beneficiary Survey (MCBS) over the fourteen-year period 1999-2012. The MCBS has prevalence of diseases assessed based on Medicare claims and provides detailed information on all health conditions but is prone to underestimation bias. The National Health and Nutrition Examination Survey (NHANES), on the other hand, collects self-reports and physical/laboratory measures only for a subset of the 107 health conditions. Neither source provides complete information, but we use them together to derive model-based corrected dummy variables in MCBS for the full range of existing health conditions using a missing data and measurement error model framework. We create multiply imputed dummy variables and use them to construct prevalence rate and trend estimates. The broader goal, however, is to use these corrected or modeled dummy variables for a multitude of policy analyses, cost modeling, and analyses of other relationships, using them either as predictors or as outcome variables.
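
As an illustration of how multiply imputed dummy variables can be turned into a single prevalence estimate, the following is a minimal sketch, not the authors' procedure. It assumes the standard Rubin combining rules for multiple imputation, which the abstract does not name explicitly, and it uses a simple-random-sampling variance that ignores the MCBS survey weights and design. The function name `pool_prevalence` and the toy data are hypothetical.

```python
import math


def pool_prevalence(imputed_flags):
    """Pool a prevalence estimate across M imputed 0/1 vectors.

    Assumption: Rubin's combining rules, with a simple-random-sampling
    variance p(1 - p)/n inside each imputation. A real analysis of a
    complex survey would use design-based estimates and variances instead.
    """
    m = len(imputed_flags)
    if m < 2:
        raise ValueError("need at least two imputations")

    estimates = []
    variances = []
    for flags in imputed_flags:
        n = len(flags)
        p = sum(flags) / n                 # prevalence in this imputation
        estimates.append(p)
        variances.append(p * (1 - p) / n)  # within-imputation variance

    q_bar = sum(estimates) / m             # pooled point estimate
    u_bar = sum(variances) / m             # average within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total_var = u_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, math.sqrt(total_var)


# Hypothetical toy data: three imputations of one condition for ten subjects.
toy = [
    [1, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 1, 0, 0],
]
prevalence, std_error = pool_prevalence(toy)
print(f"pooled prevalence = {prevalence:.3f}, standard error = {std_error:.3f}")
```

The between-imputation term is what carries the uncertainty due to the imputation model, so a standard error computed from a single imputed data set would generally be too small.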
