A big data approach to the development of mixed‐effects models for seizure count data
Author(s) -
Tharayil Joseph J.,
Chiang Sharon,
Moss Robert,
Stern John M.,
Theodore William H.,
Goldenholz Daniel M.
Publication year - 2017
Publication title -
Epilepsia
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.687
H-Index - 191
eISSN - 1528-1167
pISSN - 0013-9580
DOI - 10.1111/epi.13727
Subject(s) - count data, negative binomial distribution, covariate, overdispersion, autocorrelation, generalized linear mixed model, poisson distribution, statistics, bayesian probability, generalized linear model, linear model, epilepsy, poisson regression, variance (accounting), computer science, econometrics, medicine, mathematics, population, environmental health, accounting, psychiatry, business
Summary

Objective
Our objective was to develop a generalized linear mixed model for predicting seizure count that is useful in the design and analysis of clinical trials. This model also may benefit the design and interpretation of seizure‐recording paradigms. Most existing seizure count models do not include children, and there is currently no consensus regarding the most suitable model that can be applied to children and adults. Therefore, an additional objective was to develop a model that accounts for both adult and pediatric epilepsy.

Methods
Using data from SeizureTracker.com, a patient‐reported seizure diary tool with >1.2 million recorded seizures across 8 years, we evaluated the appropriateness of Poisson, negative binomial, zero‐inflated negative binomial, and modified negative binomial models for seizure count data based on minimization of the Bayesian information criterion. Generalized linear mixed‐effects models were used to account for demographic and etiologic covariates and for autocorrelation structure. Holdout cross‐validation was used to evaluate predictive accuracy in simulating seizure frequencies.

Results
For both adults and children, we found that a negative binomial model with autocorrelation over 1 day was optimal. Using holdout cross‐validation, the proposed model was found to provide accurate simulation of seizure counts for patients with up to four seizures per day.

Significance
The optimal model can be used to generate more realistic simulated patient data with very few input parameters. The availability of a parsimonious, realistic virtual patient model can be of great utility in simulations of phase II/III clinical trials, epilepsy monitoring units, outpatient biosensors, and mobile Health (mHealth) applications.
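
The model-selection step described in the Methods (choosing among count distributions by minimizing the Bayesian information criterion) can be illustrated with a minimal sketch. This is not the authors' code or data: the counts are synthetic, the function names (`simulate_counts`, `compare_models`) and parameter values are hypothetical, and the sketch omits the covariates, random effects, and 1-day autocorrelation of the published mixed-effects model. The negative binomial dispersion is found by a coarse grid search, so its likelihood is only approximately maximized.

```python
import math
import random


def poisson_loglik(counts, mu):
    """Log-likelihood of i.i.d. Poisson counts with mean mu."""
    return sum(k * math.log(mu) - mu - math.lgamma(k + 1) for k in counts)


def negbin_loglik(counts, mu, r):
    """Log-likelihood of i.i.d. negative binomial counts.

    Parameterized by mean mu and size r, so variance = mu + mu**2 / r.
    """
    p = r / (r + mu)
    return sum(
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log(1.0 - p)
        for k in counts
    )


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * loglik


def sample_poisson(rng, lam):
    """Draw one Poisson variate (Knuth's method; adequate for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_counts(rng, n_days, mu, r):
    """Synthetic overdispersed daily counts via a gamma-Poisson mixture."""
    return [sample_poisson(rng, rng.gammavariate(r, mu / r))
            for _ in range(n_days)]


def compare_models(counts):
    """Return the BIC of Poisson and negative binomial fits to the counts."""
    n = len(counts)
    mu = sum(counts) / n  # sample mean is the MLE of the mean in both models
    # Coarse log-spaced grid for the dispersion r (assumption: 0.01 to 1000).
    grid = [10 ** (e / 20) for e in range(-40, 61)]
    best_r = max(grid, key=lambda r: negbin_loglik(counts, mu, r))
    return {
        "poisson": bic(poisson_loglik(counts, mu), 1, n),
        "negative_binomial": bic(negbin_loglik(counts, mu, best_r), 2, n),
        "r": best_r,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    counts = simulate_counts(rng, n_days=2000, mu=1.0, r=0.5)
    print(compare_models(counts))
```

On overdispersed counts such as these, the negative binomial fit should yield the lower BIC despite its extra parameter, which is the kind of comparison the abstract reports at much larger scale.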