Adjusting for differential misclassification in matched case‐control studies utilizing health administrative data
Author(s) - Högg Tanja, Zhao Yinshan, Gustafson Paul, Petkau John, Fisk John, Marrie Ruth Ann, Tremlett Helen
Publication year - 2019
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.8203
Subject(s) - bayesian probability, observational study, bayes' theorem, computer science, disease, leverage (statistics), econometrics, differential (mechanical device), data mining, medicine, statistics, machine learning, artificial intelligence, mathematics, pathology, engineering, aerospace engineering
In epidemiological studies of secondary data sources, lack of accurate disease classifications often requires investigators to rely on diagnostic codes generated by physicians or hospital systems to identify case and control groups, resulting in a less‐than‐perfect assessment of the disease under investigation. Moreover, because coding practices differ among physicians, it is hard to determine the factors that affect the chance of an incorrectly assigned disease status. The result is a dilemma in which assumptions of non‐differential misclassification are questionable but, at the same time, necessary to proceed with statistical analyses. This paper develops an approach to adjust exposure‐disease association estimates for disease misclassification without the need for simplifying non‐differentiality assumptions or prior information about a complicated classification mechanism. We propose to leverage rich temporal information on disease‐specific healthcare utilization to estimate each participant's probability of being a true case and to use these estimates as weights in a Bayesian analysis of matched case‐control data. The approach is applied to data from a recent observational study into the early symptoms of multiple sclerosis (MS), where MS cases were identified from Canadian health administrative databases and matched to population controls that are assumed to be correctly classified. A comparison of our results with those from non‐differentially adjusted analyses reveals conflicting inferences and highlights that ill‐suited assumptions of non‐differential misclassification can exacerbate biases in association estimates.