
Influenza forecasting for French regions combining EHR, web and climatic data sources with a machine learning ensemble approach
Author(s) - Canelle Poirier, Yulin Hswen, Guillaume Bouzillé, Marc Cuggia, Audrey Lavenu, John S. Brownstein, Thomas G. Brewer, Mauricio Santillana
Publication year - 2021
Publication title - PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0250890
Subject(s) - disease surveillance, computer science, public health, data science, disease, public health surveillance, population health, machine learning, outbreak, psychological intervention, population, artificial intelligence, medicine, environmental health, nursing, pathology, virology, psychiatry
Effective and timely disease surveillance systems have the potential to help public health officials design interventions to mitigate the effects of disease outbreaks. Currently, healthcare-based disease monitoring systems in France offer influenza activity information that lags real time by one to three weeks. This temporal data gap introduces uncertainty that prevents public health officials from having a timely perspective on population-level disease activity. Here, we present a machine-learning modeling approach that produces real-time estimates and short-term forecasts of influenza activity for the twelve continental regions of France by leveraging multiple disparate data sources, including Google search activity, real-time and local weather information, flu-related Twitter microblogs, electronic health record data, and historical disease activity synchronicities across regions. Our results show that all data sources contribute to improving influenza surveillance and that machine-learning ensembles combining all data sources lead to accurate and timely predictions.
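
The abstract does not specify how the ensemble combines its inputs, so the following is only a minimal sketch of one common way to do it, not the authors' method: each data source yields its own estimate, and the estimates are averaged with weights inversely proportional to each source's recent mean absolute error. The function names, source labels, and all numbers are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def inverse_error_weights(past_predictions, past_truth, eps=1e-9):
    """Weight each source by the inverse of its mean absolute error.

    past_predictions: dict mapping a source name to its past weekly estimates.
    past_truth: observed activity for the same weeks.
    Assumption: inverse-MAE weighting is an illustrative choice only.
    """
    inverse_errors = {}
    for source, preds in past_predictions.items():
        mae = mean(abs(p - t) for p, t in zip(preds, past_truth))
        inverse_errors[source] = 1.0 / (mae + eps)  # eps avoids division by zero
    total = sum(inverse_errors.values())
    # Normalize so that the weights sum to 1.
    return {source: w / total for source, w in inverse_errors.items()}


def ensemble_estimate(current_predictions, weights):
    """Combine the current per-source estimates into one weighted average."""
    return sum(weights[s] * current_predictions[s] for s in weights)


# Hypothetical past observations and per-source estimates (made-up values).
past_truth = [100.0, 120.0, 150.0]
past_predictions = {
    "search": [110.0, 130.0, 160.0],
    "weather": [120.0, 140.0, 170.0],
    "ehr": [105.0, 125.0, 155.0],
}

weights = inverse_error_weights(past_predictions, past_truth)

# Hypothetical estimates for the current week from each source.
current = {"search": 180.0, "weather": 200.0, "ehr": 170.0}
estimate = ensemble_estimate(current, weights)
print(weights)
print(round(estimate, 2))
```

In this toy setup the source with the smallest past error receives the largest weight, so the combined estimate leans toward it while still drawing on the other sources.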