Can Google Trends data improve forecasting of Lyme disease incidence?
Author(s) -
Kapitány-Fövény Máté,
Ferenci Tamás,
Sulyok Zita,
Kegele Josua,
Richter Hardy,
Vályi-Nagy István,
Sulyok Mihály
Publication year - 2019
Publication title -
Zoonoses and Public Health
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.87
H-Index - 65
eISSN - 1863-2378
pISSN - 1863-1959
DOI - 10.1111/zph.12539
Subject(s) - mean absolute percentage error , statistics , lyme disease , autoregressive integrated moving average , mean squared error , residual , incidence (geometry) , mathematics , time series , medicine , algorithm , geometry , virology
Background Online activity‐based epidemiological surveillance and forecasting are receiving increasing attention. To date, Google search volumes have not been assessed for forecasting tick‐borne diseases. We therefore analysed whether extending traditional surveillance data with Google Trends data improves forecasts of Lyme disease incidence.

Methods Data on the weekly incidence of Lyme disease in Germany from 16 June 2013 to 27 May 2018 were obtained from the database of the Robert Koch Institute. Internet search data were obtained from Google Trends for the search term “Borreliose” in Germany, using the “last 5 years” timespan category. Data were split into training (16 June 2013 to 11 June 2017) and validation (12 June 2017 to 27 May 2018) data sets. A seasonal autoregressive integrated moving average model, SARIMA (0,1,1) (0,1,1) [52], was selected to describe the time series of the weekly Lyme incidence. We then added the Google Trends data as an external regressor and again identified SARIMA (0,1,1) (0,1,1) [52] as the optimal model. We made predictions for the validation interval using these two models and compared the predictions with the values of the validation data set.

Results Forecasts for the validation timespan were similar for the two models. Comparing the forecasted values with the reported ones gave a root mean squared error (RMSE) of 0.3763 and a mean absolute percentage error (MAPE) of 8.233 for the model without Google searches, and an RMSE of 0.3732 and a MAPE of 8.17495 for the model expanded with Google Trends values. The difference between the predictive performances was not statistically significant (Diebold‐Mariano test, p‐value = 0.4152).

Conclusion Google Trends data are a good correlate of the reported incidence of Lyme disease in Germany, but they failed to significantly improve the forecasting accuracy of models based on traditional data.
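
The accuracy measures and the model comparison named in the Results can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy numbers are hypothetical, and the Diebold‐Mariano test is simplified here to one‐step‐ahead forecasts with squared‐error loss, a normal approximation, and no autocorrelation correction.

```python
import math
from statistics import NormalDist, mean


def rmse(actual, forecast):
    """Root mean squared error between observed and forecast values."""
    return math.sqrt(mean((a - f) ** 2 for a, f in zip(actual, forecast)))


def mape(actual, forecast):
    """Mean absolute percentage error in percent (observed values must be non-zero)."""
    return 100.0 * mean(abs((a - f) / a) for a, f in zip(actual, forecast))


def diebold_mariano(actual, forecast_1, forecast_2):
    """Simplified Diebold-Mariano test on squared-error loss differentials.

    Returns (statistic, two-sided p-value). Assumes one-step-ahead forecasts
    and a loss differential that is not constant (non-zero variance).
    """
    d = [(a - f1) ** 2 - (a - f2) ** 2
         for a, f1, f2 in zip(actual, forecast_1, forecast_2)]
    n = len(d)
    d_bar = mean(d)
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    stat = d_bar / math.sqrt(var_d / n)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(stat)))
    return stat, p_value


# Hypothetical toy values, NOT data from the study.
actual = [2.0, 4.0, 5.0, 4.0]
model_a = [2.5, 3.5, 5.5, 4.5]   # stand-in for the model without search data
model_b = [2.0, 4.5, 5.0, 3.0]   # stand-in for the model with search data

print("RMSE:", rmse(actual, model_a), rmse(actual, model_b))
print("MAPE:", mape(actual, model_a), mape(actual, model_b))
print("DM statistic, p-value:", diebold_mariano(actual, model_a, model_b))
```

A large p‐value from the test, as reported in the abstract, means the two sets of forecast errors cannot be distinguished statistically, even if one model has a slightly lower RMSE or MAPE.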