Nested case‐control data analysis using weighted conditional logistic regression in The Environmental Determinants of Diabetes in the Young (TEDDY) study: A novel approach
Author(s) - Lee HyeSeung, Lynch Kristian F., Krischer Jeffrey P.
Publication year - 2020
Publication title - Diabetes/Metabolism Research and Reviews
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.307
H-Index - 110
eISSN - 1520-7560
pISSN - 1520-7552
DOI - 10.1002/dmrr.3204
Subject(s) - statistics, logistic regression, inverse probability weighting, context (archaeology), weighting, regression analysis, regression, regression dilution, confidence interval, econometrics, logistic model tree, selection (genetic algorithm), mathematics, computer science, polynomial regression, artificial intelligence, medicine, paleontology, propensity score matching, radiology, biology
Background: A nested case‐control (NCC) design within a prospective cohort study can realize substantial benefits for biomarker studies. In this context, it is natural to take sample availability into account when selecting controls, to minimize data loss when implementing the design. However, doing so violates the random selection that the design requires and leads to biased analyses. Inverse probability weighting may improve the analysis, but the current approach, weighted Cox regression, fails to maintain the benefits of the NCC design.

Methods: This paper introduces weighted conditional logistic regression. We illustrate the proposed analysis using data recently investigated in The Environmental Determinants of Diabetes in the Young (TEDDY) study. To limit potential data loss, the TEDDY NCC design was moderately selective in its choice of controls. A data‐driven simulation study was performed to illustrate the bias correction relative to analyses that ignore nonrandom control selection.

Results: In the TEDDY data analysis, the standard analysis using conditional logistic regression estimated the parameter as −0.015 (95% confidence interval: −0.023, −0.007). The biased estimate using Cox regression was −0.011 (−0.019, −0.003). Weighted Cox regression estimated −0.013 (−0.026, 0.0004). The proposed weighted conditional logistic regression estimated −0.020 (−0.033, −0.007), a stronger negative effect size than the one obtained with conditional logistic regression. The simulation study also showed that the standard estimate of β, which ignores the nonrandom control selection, tends to be greater than the true β (i.e., positive relative bias).

Conclusion: Weighted conditional logistic regression can enhance the analysis by offering flexibility in the selection of controls while maintaining the matching.
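As a rough illustration of the idea behind a weighted conditional likelihood, the sketch below fits a single coefficient β by Newton's method. Each matched set contributes β·x_case − log Σ_j w_j·exp(β·x_j), where w_j scales a member's term in the denominator (for example, an inverse selection probability for controls, with cases given weight 1). This form of the weights, the names `fit_weighted_clogit` and `make_toy_pairs`, and the toy matched pairs are assumptions made for illustration; the abstract does not specify the authors' exact estimator, and none of the values below come from TEDDY.

```python
import math


def fit_weighted_clogit(matched_sets, tol=1e-10, max_iter=100):
    """Estimate one coefficient beta from matched sets.

    matched_sets: list of sets; each set is a list of
    (x, weight, is_case) tuples with exactly one case.
    Assumed log-likelihood (illustrative, not the paper's stated form):
        sum over sets of  beta * x_case - log(sum_j w_j * exp(beta * x_j))
    """
    beta = 0.0
    for _ in range(max_iter):
        score = 0.0  # first derivative of the log-likelihood
        info = 0.0   # negative second derivative
        for members in matched_sets:
            x_case = next(x for x, _, is_case in members if is_case)
            s0 = sum(w * math.exp(beta * x) for x, w, _ in members)
            s1 = sum(w * x * math.exp(beta * x) for x, w, _ in members)
            s2 = sum(w * x * x * math.exp(beta * x) for x, w, _ in members)
            mean = s1 / s0                    # weighted mean of x in the set
            score += x_case - mean
            info += s2 / s0 - mean * mean     # weighted variance of x in the set
        step = score / info                   # Newton-Raphson update
        beta += step
        if abs(step) < tol:
            break
    return beta


def make_toy_pairs(control_weight):
    """Hypothetical 1:1 matched pairs with a binary exposure x.

    Three pairs have an exposed case and an unexposed control;
    one pair has an unexposed case and an exposed control.
    """
    exposed_case = [(1.0, 1.0, True), (0.0, control_weight, False)]
    exposed_control = [(0.0, 1.0, True), (1.0, control_weight, False)]
    return [exposed_case] * 3 + [exposed_control]


beta_unweighted = fit_weighted_clogit(make_toy_pairs(1.0))
beta_weighted = fit_weighted_clogit(make_toy_pairs(2.0))
print(beta_unweighted, beta_weighted)
```

With all weights equal to 1 the function reduces to ordinary conditional logistic regression for matched sets; changing the control weights shifts the estimate, which is the mechanism a weighted analysis relies on to account for nonrandom control selection while keeping the matched sets intact.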