Open Access
Applying Statistical Machine Learning Methods to Analysis Differences in the Severity Level of COVID-19 among Countries
Author(s) - Wen Yin, Chenchen Pan, Nanyi Deng, Dong Jin Ji
Publication year - 2021
Publication title - Journal of Software
Language(s) - English
Resource type - Journals
ISSN - 1796-217X
DOI - 10.17706/jsw.16.5.219-234
Subject(s) - gross domestic product, mean squared error, per capita, life expectancy, outlier, population, statistics, computer science, machine learning, econometrics, artificial intelligence, mathematics, demography, economics, economic growth, sociology
The COVID-19 pandemic has had a significant negative impact on countries around the world, and its severity appears to differ observably among nations. This study aims to provide insight into the roles that social and economic factors played in this variation. By investigating potential patterns through exploratory data analysis and then constructing models with several popular machine learning techniques, we examine the validity of the underlying assumptions and identify potential limitations. Total deaths per million population, log-transformed to reduce the influence of outliers, is used as the dependent variable. Explanatory factors available in the dataset include life expectancy, unemployment rate and population. After removing and transforming outliers, various machine learning methods are fitted with cross-validation, and the best model is selected by predefined metrics: root-mean-squared error (RMSE) and mean absolute error (MAE). The results show that the Gradient Boosting Machine (GBM) technique achieves the lowest RMSE and MAE. The RMSE and MAE values indicate no overfitting, and the GBM algorithm identifies the most influential factors, such as life expectancy, healthcare expense per Gross Domestic Product (GDP) and GDP per capita, which are critical explanatory variables for predicting total deaths per million population.
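The evaluation procedure summarized in the abstract (log-transform the target, then score candidate models by cross-validated RMSE and MAE) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the deaths-per-million values are made up, the helper names are hypothetical, `log1p` is one possible choice of log transformation, and a predict-the-training-mean baseline stands in for GBM or any other model. It is not the authors' code or data.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-squared error: penalizes large errors more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate_mean_baseline(y, k=5, seed=0):
    """Average RMSE and MAE over k folds for a predict-the-training-mean model.

    The baseline is a placeholder; a real study would fit each candidate
    model (for example a gradient boosting regressor) on the training folds.
    """
    rmses, maes = [], []
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [y[i] for i in range(len(y)) if i not in held_out]
        test = [y[i] for i in fold]
        pred = [sum(train) / len(train)] * len(test)
        rmses.append(rmse(test, pred))
        maes.append(mae(test, pred))
    return sum(rmses) / k, sum(maes) / k


# Made-up deaths-per-million values, for illustration only (not from the paper).
deaths_per_million = [0.5, 3.0, 12.0, 40.0, 85.0, 150.0, 300.0, 620.0, 900.0, 1500.0]

# log1p compresses the long right tail so extreme countries dominate less.
log_target = [math.log1p(v) for v in deaths_per_million]

cv_rmse, cv_mae = cross_validate_mean_baseline(log_target, k=5)
print(f"CV RMSE: {cv_rmse:.3f}, CV MAE: {cv_mae:.3f}")
```

Because RMSE squares the errors before averaging, it is never smaller than MAE on the same predictions; a wide gap between the two suggests that a few large errors dominate. Comparing the cross-validated scores against the training scores is one common way to check for overfitting.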
