Using the Whole Cohort in the Analysis of Case-Cohort Data
Author(s) -
N. E. Breslow,
Thomas Lumley,
Christie M. Ballantyne,
Lloyd E. Chambless,
Michal Kulich
Publication year - 2009
Publication title -
american journal of epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.33
H-Index - 256
eISSN - 1476-6256
pISSN - 0002-9262
DOI - 10.1093/aje/kwp055
Subject(s) - cohort , medicine , cohort study , environmental health
Case-cohort data analyses often ignore valuable information on cohort members not sampled as cases or controls. The Atherosclerosis Risk in Communities (ARIC) study investigators, for example, typically report data for just the 10%-15% of subjects sampled for substudies of their cohort of 15,972 participants. Remaining subjects contribute to stratified sampling weights only. Analysis methods implemented in the freely available R statistical system (http://cran.r-project.org/) make better use of the data through adjustment of the sampling weights via calibration or estimation. By reanalyzing data from an ARIC study of coronary heart disease and simulations based on data from the National Wilms Tumor Study, the authors demonstrate that such adjustment can dramatically improve the precision of hazard ratios estimated for baseline covariates known for all subjects. Adjustment can also improve precision for partially missing covariates, those known for substudy participants only, when their values may be imputed with reasonable accuracy for the remaining cohort members. Links are provided to software, data sets, and tutorials showing in detail the steps needed to carry out the adjusted analyses. Epidemiologists are encouraged to consider use of these methods to enhance the accuracy of results reported from case-cohort analyses.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom