Open Access
Can long-term historical data from electronic medical records improve surveillance for epidemics of acute respiratory infections? A systematic evaluation
Author(s) - Hongzhang Zheng, William H. Woodall, Abigail L. Carlson, Sylvain DeLisle
Publication year - 2018
Publication title - PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0191324
Subject(s) - outbreak, autoregressive integrated moving average, public health surveillance, medicine, false alarm, electronic surveillance, time series, medical record, computer science, statistics, medical emergency, public health, pediatrics, artificial intelligence, virology, machine learning, computer security, surgery, mathematics, pathology
Background: As the deployment of electronic medical records (EMR) expands, so does the availability of long-term datasets that could serve to enhance public health surveillance. We hypothesized that EMR-based surveillance systems that incorporate seasonality and other long-term trends would discover outbreaks of acute respiratory infections (ARI) sooner than systems that only consider the recent past.

Methods: We simulated surveillance systems aimed at discovering modeled influenza outbreaks injected into backgrounds of patients with ARI. Backgrounds of daily case counts were either synthesized or obtained by applying one of three previously validated ARI case-detection algorithms to authentic EMR entries. From the time of outbreak injection, detection statistics were applied daily to paired background+injection and background-only time series. The relationship between the detection delay (the time from injection to the first alarm found only in the background+injection data) and the false-alarm rate (FAR) was determined by systematically varying the statistical alarm threshold. We compared this relationship for outbreak detection methods that used either 7 days of past data (early aberrancy reporting system (EARS)) or 2–4 years of past data (seasonal autoregressive integrated moving average (SARIMA) time series modeling).

Results: In otherwise identical surveillance systems, SARIMA detected epidemics sooner than EARS at any FAR below 10%. The algorithms used to detect single ARI cases affected both the feasibility and the marginal benefits of SARIMA modeling. Under plausible real-world conditions, SARIMA could reduce detection delay by 5–16 days. It was also more sensitive at detecting the summer wave of the 2009 influenza pandemic.

Conclusion: Time series modeling of long-term historical EMR data can reduce the time it takes to discover epidemics of ARI. Realistic surveillance simulations may prove invaluable for optimizing system design and tuning.
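
The evaluation loop described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the statistic is a simple z-score against the preceding 7 days (in the spirit of EARS), and the synthetic background, the ramp-shaped injection, the standard-deviation floor, and the names `ears_statistic`, `alarm_days`, and `evaluate` are all assumptions made for illustration. The SARIMA arm of the comparison is not shown.

```python
import random
from statistics import mean, stdev

BASELINE_DAYS = 7  # the abstract states that EARS uses 7 days of past data


def ears_statistic(counts, day, baseline_days=BASELINE_DAYS):
    """Z-score of counts[day] against the preceding baseline window."""
    window = counts[day - baseline_days:day]
    # Assumption: floor the SD so a flat baseline does not divide by zero.
    return (counts[day] - mean(window)) / max(stdev(window), 0.5)


def alarm_days(counts, start, threshold):
    """Days from `start` onward whose statistic exceeds the threshold."""
    return {d for d in range(start, len(counts))
            if ears_statistic(counts, d) > threshold}


def evaluate(background, injection, start, threshold):
    """Return (detection_delay, false_alarm_rate) for one alarm threshold.

    `start` must be at least BASELINE_DAYS, and the injection must fit
    inside the background series.
    """
    injected = list(background)
    for offset, extra in enumerate(injection):
        injected[start + offset] += extra
    # Alarms on the background-only series count as false alarms.
    false_alarms = alarm_days(background, start, threshold)
    # Keep only alarms that appear in the injected series alone.
    unique = alarm_days(injected, start, threshold) - false_alarms
    delay = min(unique) - start if unique else None
    far = len(false_alarms) / (len(background) - start)
    return delay, far


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical background: 120 days of noisy daily ARI case counts.
    background = [random.randint(15, 25) for _ in range(120)]
    # Hypothetical outbreak: a linear ramp of extra cases over 30 days.
    injection = [2 * k for k in range(1, 31)]
    # Sweeping the threshold traces out the delay-versus-FAR relationship.
    for threshold in (1.0, 2.0, 3.0, 4.0):
        delay, far = evaluate(background, injection, 60, threshold)
        print(f"threshold={threshold}: delay={delay} days, FAR={far:.2%}")
```

Raising the threshold lowers the false-alarm rate; plotting delay against FAR for two detection methods on the same paired series gives the kind of comparison the abstract reports.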