Imputing Missing Race/Ethnicity in Pediatric Electronic Health Records: Reducing Bias with Use of U.S. Census Location and Surname Data
Author(s) -
Grundmeier Robert W.,
Song Lihai,
Ramos Mark J.,
Fiks Alexander G.,
Elliott Marc N.,
Fremont Allen,
Pace Wilson,
Wasserman Richard C.,
Localio Russell
Publication year - 2015
Publication title - Health Services Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.706
H-Index - 121
eISSN - 1475-6773
pISSN - 0017-9124
DOI - 10.1111/1475-6773.12295
Subject(s) - missing data, imputation (statistics), ethnic group, census, race (biology), demography, medicine, statistics, population, mathematics, environmental health, sociology, gender studies, anthropology
Objective - To assess the utility of imputing race/ethnicity using U.S. Census race/ethnicity, residential address, and surname information compared to standard missing data methods in a pediatric cohort.
Data Sources/Study Setting - Electronic health record data from 30 pediatric practices with known race/ethnicity.
Study Design - In a simulation experiment, we constructed dichotomous and continuous outcomes with pre‐specified associations with known race/ethnicity. Bias was introduced by nonrandomly setting race/ethnicity to missing. We compared typical methods for handling missing race/ethnicity (multiple imputation alone with clinical factors, complete case analysis, indicator variables) to multiple imputation incorporating surname and address information.
Principal Findings - Imputation using U.S. Census information reduced bias for both continuous and dichotomous outcomes.
Conclusions - The new method reduces bias when race/ethnicity is partially, nonrandomly missing.
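The study design can be illustrated with a small simulation. The sketch below is not the authors' code or their imputation method; it only shows, under invented assumptions, how nonrandom missingness in a group variable can bias a complete case analysis. The function name `simulate_complete_case_bias`, the two-level `group` variable, the effect size of 0.5, and the missingness model are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_complete_case_bias(n=50_000, true_effect=0.5, seed=0):
    """Compare a full-data estimate with a complete case estimate.

    Hypothetical setup: a binary group indicator stands in for
    race/ethnicity, and a continuous outcome has a pre-specified
    association (true_effect) with that group.
    """
    rng = np.random.default_rng(seed)

    # Known group membership and an outcome with a known association.
    group = rng.binomial(1, 0.3, size=n)
    outcome = 1.0 + true_effect * group + rng.normal(0.0, 1.0, size=n)

    # Nonrandom missingness (assumed model): in group 1, records with
    # higher outcomes are more likely to have the group value missing.
    logit = -1.0 + 1.5 * group * outcome
    p_missing = 1.0 / (1.0 + np.exp(-logit))
    observed = rng.random(n) >= p_missing

    def mean_difference(g, y):
        # For a binary predictor this equals the OLS slope of y on g.
        return y[g == 1].mean() - y[g == 0].mean()

    return {
        "true_effect": true_effect,
        "full": mean_difference(group, outcome),
        "complete_case": mean_difference(group[observed], outcome[observed]),
        "fraction_missing": 1.0 - observed.mean(),
    }


if __name__ == "__main__":
    for name, value in simulate_complete_case_bias().items():
        print(f"{name}: {value:.3f}")
```

Because missingness here depends on both the group and the outcome, dropping incomplete records shifts the estimated group difference away from the pre-specified value. An imputation approach of the kind the abstract describes would add auxiliary predictors of group membership (surname and address information in the study) to the imputation model; that step is not implemented in this sketch.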