Correction for misclassification of a categorized exposure in binary regression using replication data
Author(s) - Dalen Ingvild, Buonaccorsi John P., Sexton Joseph A., Laake Petter, Thoresen Magne
Publication year - 2009
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.3712
Subject(s) - statistics , framingham heart study , replicate , regression , odds , parametric statistics , covariate , data set , regression analysis , econometrics , computer science , standard error , logistic regression , binary data , statistic , type i and type ii errors , nonparametric statistics , mathematics , medicine , binary number , framingham risk score , disease , arithmetic
Continuous epidemiologic exposure data are often categorized according to one or more cut points before inclusion in a regression analysis involving some outcome variable. If the original data are subject to measurement error, the categorized data will be afflicted with misclassification, which is differential, and which induces biases in naïve methods that ignore the misclassification. We propose a method for measurement error adjustment in these settings, when there are replicate data available on the original measurements, and when the outcome variable is dichotomous. Working on the continuous measurements, conditional densities of the exposure given the outcome are estimated and used to obtain odds ratios. The estimation of densities is done either parametrically or nonparametrically. The method is compared with the naïve approach of simply categorizing the erroneous mean measurements in simulation studies, and although the nonparametric method is more variable, it has the best overall performance, the greatest differences being observed in settings where the effects and/or the measurement errors are large. The performance of the parametric method is highly dependent on the model fit. Applying the methods to a real‐life data set from the Framingham Heart Study produced larger estimated odds ratios for coronary heart disease as a result of elevated systolic blood pressure, as compared with naïve odds ratios. We provide some discussion of alternative procedures that might be considered including regression calibration, SIMEX and the use of estimated misclassification probabilities. Copyright © 2009 John Wiley & Sons, Ltd.
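The idea described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator; it is a minimal illustration in the spirit of the parametric variant, under assumptions that are not stated in the abstract: the true exposure is normal within each outcome group, the measurement error is additive, normal, and independent of the outcome, and every subject has the same number of replicates. All function names and parameter values are hypothetical. The naive odds ratio categorizes the mean of the error-prone replicates at a cut point, while the corrected one estimates the conditional exposure density in each outcome group after subtracting the error variance estimated from the replicates.

```python
import random
from statistics import NormalDist, fmean, variance


def simulate(n, k=2, mu0=0.0, mu1=0.5, sd_x=1.0, sd_u=1.0, seed=0):
    """Return a list of (outcome, replicates); all parameter values are illustrative."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.random() < 0.5                      # dichotomous outcome
        x = rng.gauss(mu1 if y else mu0, sd_x)      # true exposure, never observed
        w = [x + rng.gauss(0.0, sd_u) for _ in range(k)]  # error-prone replicates
        data.append((y, w))
    return data


def true_or(mu0, mu1, sd_x, cut):
    """Odds ratio for 'exposure above cut' computed from the true normal densities."""
    odds = []
    for mu in (mu1, mu0):
        p = 1.0 - NormalDist(mu, sd_x).cdf(cut)
        odds.append(p / (1.0 - p))
    return odds[0] / odds[1]


def naive_or(data, cut):
    """Categorize the replicate mean at the cut point and form the 2x2 odds ratio."""
    counts = {(e, y): 0 for e in (True, False) for y in (True, False)}
    for y, w in data:
        counts[(fmean(w) > cut, y)] += 1
    return (counts[(True, True)] * counts[(False, False)]) / (
        counts[(False, True)] * counts[(True, False)]
    )


def corrected_or(data, cut):
    """Normal-model correction (an assumption of this sketch, not the paper's method)."""
    k = len(data[0][1])
    # Pooled within-subject variance estimates the measurement error variance.
    var_u = fmean(variance(w) for _, w in data)
    odds = {}
    for grp in (True, False):
        means = [fmean(w) for y, w in data if y == grp]
        # Variance of replicate means = var(X | Y) + var_u / k, so subtract the error part.
        var_x = max(variance(means) - var_u / k, 1e-12)
        p = 1.0 - NormalDist(fmean(means), var_x ** 0.5).cdf(cut)
        odds[grp] = p / (1.0 - p)
    return odds[True] / odds[False]


if __name__ == "__main__":
    sample = simulate(n=20000, seed=1)
    print("true     ", true_or(0.0, 0.5, 1.0, cut=1.0))
    print("naive    ", naive_or(sample, cut=1.0))
    print("corrected", corrected_or(sample, cut=1.0))
```

Under these assumptions the naive odds ratio is pulled toward 1, and the corrected estimate should lie closer to the value implied by the true densities. As the abstract notes for the parametric method, such a correction depends heavily on how well the assumed model fits; a nonparametric variant would replace the normal density with a density estimate that accounts for the measurement error.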