Applying sample survey methods to clinical trials data
Author(s) -
LaVange L. M.,
Koch G. G.,
Schwartz T. A.
Publication year - 2001
Publication title -
Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.732
Subject(s) - statistics , cluster analysis , computer science , sample size determination , intraclass correlation , sample (material) , econometrics , multivariate statistics , data mining , statistical model , sampling (signal processing) , sampling design , regression analysis , random effects model , logistic regression , mathematics , medicine , population , psychometrics , chemistry , meta analysis , environmental health , filter (signal processing) , chromatography , computer vision
Abstract This paper outlines the utility of statistical methods for sample surveys in analysing clinical trials data. Sample survey statisticians face a variety of complex data analysis issues deriving from the use of multi‐stage probability sampling from finite populations. One such issue is that of clustering of observations at the various stages of sampling. Survey data analysis approaches developed to accommodate clustering in the sample design have more general application to clinical studies in which repeated measures structures are encountered. Situations where these methods are of interest include multi‐visit studies where responses are observed at two or more time points for each patient, multi‐period cross‐over studies, and epidemiological studies for repeated occurrences of adverse events or illnesses. We describe statistical procedures for fitting multiple regression models to sample survey data that are more effective for repeated measures studies with complicated data structures than the more traditional approaches of multivariate repeated measures analysis. In this setting, one can specify a primary sampling unit within which repeated measures have intraclass correlation. This intraclass correlation is taken into account by sample survey regression methods through robust estimates of the standard errors of the regression coefficients. Regression estimates are obtained from model fitting estimation equations which ignore the correlation structure of the data (that is, computing procedures which assume that all observational units are independent or are from simple random samples). The analytic approach is straightforward to apply with logistic models for dichotomous data, proportional odds models for ordinal data, and linear models for continuously scaled data, and results are interpretable in terms of population average parameters. 
Through the features summarized here, the sample survey regression methods have many similarities to the broader family of methods based on generalized estimating equations (GEE). Sample survey methods for the analysis of time‐to‐event data have more recently been developed and implemented in the context of finite probability sampling. Given the importance of survival endpoints in late phase studies for drug development, these methods have clear utility in the area of clinical trials data analysis. A brief overview of methods for sample survey data analysis is first provided, followed by motivation for applying these methods to clinical trials data. Examples drawn from three clinical studies are provided to illustrate survey methods for logistic regression, proportional odds regression and proportional hazards regression. Potential problems with the proposed methods and ways of addressing them are discussed. Copyright © 2001 John Wiley & Sons, Ltd.
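The core idea in the abstract (fit the regression as if observations were independent, then correct the standard errors for clustering within the primary sampling unit) can be sketched for the simplest linear case. This is only an illustrative sketch, not the authors' implementation: the function name `ols_cluster_robust`, the toy data, and the `g / (g - 1)` small-sample factor are assumptions introduced here, and sampling weights and stratification are left out.

```python
from collections import defaultdict
from math import sqrt


def ols_cluster_robust(y, x, psu):
    """Fit y = b0 + b1*x assuming independence, then return the
    coefficients with cluster-robust (sandwich) standard errors.

    Hypothetical helper for illustration; no weights or strata.
    """
    n = len(y)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    # Inverse of X'X for the design matrix with columns [1, x]
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]
    # Point estimates ignore the correlation structure entirely
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy

    # Sum the estimating-equation terms (1, x_i) * residual_i within each PSU
    totals = defaultdict(lambda: [0.0, 0.0])
    for yi, xi, pi in zip(y, x, psu):
        r = yi - b0 - b1 * xi
        totals[pi][0] += r
        totals[pi][1] += xi * r

    # Diagonal of inv * (sum over PSUs of outer products of totals) * inv
    var = [0.0, 0.0]
    for s0, s1 in totals.values():
        for i in range(2):
            var[i] += (inv[i][0] * s0 + inv[i][1] * s1) ** 2

    g = len(totals)
    scale = g / (g - 1)  # assumed small-sample correction; conventions vary
    se = tuple(sqrt(scale * v) for v in var)
    return (b0, b1), se


# Made-up data: four patients (PSUs), two visits each, x = treatment arm
y = [1, 1, 3, 3, 4, 4, 6, 6]
x = [0, 0, 0, 0, 1, 1, 1, 1]
patient = ["A", "A", "B", "B", "C", "C", "D", "D"]

beta, se_clustered = ols_cluster_robust(y, x, patient)
# Treating every observation as its own PSU ignores the within-patient correlation
_, se_unclustered = ols_cluster_robust(y, x, list(range(len(y))))
```

Both calls return the same coefficients, because the point estimates never use the clustering; only the standard errors change. With these made-up data the patient-level standard error for the treatment coefficient is larger than the one that treats each visit as independent, which is the expected direction when repeated measures are positively correlated within a patient.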