Open Access
Comparison Decision Tree and Logistic Regression Machine Learning Classification Algorithms to determine Covid-19
Author(s) - Artika Arista
Publication year - 2022
Publication title - sinkron
Language(s) - English
Resource type - Journals
eISSN - 2541-2019
pISSN - 2541-044X
DOI - 10.33395/sinkron.v7i1.11243
Subject(s) - logistic regression, decision tree, machine learning, covid-19, artificial intelligence, python (programming language), computer science, cross validation, algorithm, regression, sore throat, random forest, tree (set theory), statistics, medicine, mathematics, infectious disease (medical specialty), disease, surgery, operating system, mathematical analysis, pathology
Many people today are unsure whether they have coronavirus disease 2019 (COVID-19). Frequent fever, dry cough, and sore throat are all signs and symptoms of COVID-19, and anyone who shows them should see a doctor or go to a clinic as soon as possible. Because COVID-19 can cause a wide range of symptoms, it is vital to learn and understand the fundamental differences among them. The experiments used two machine learning classification algorithms, Decision Tree (DT) and Logistic Regression (LR). Both were implemented and analyzed in Python using Jupyter Notebook 6.4.5. On the COVID-19 symptoms dataset, the DT model obtained the best average cross-validation and testing performance, with 98.0% accuracy in cross-validation and 98.0% accuracy in testing. The LR model ranked second, with 96.0% accuracy in cross-validation and 97.0% accuracy in testing. Consequently, DT outperforms LR on the COVID-19 symptoms dataset in both cross-validation and testing.
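The comparison described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the author's code: it assumes scikit-learn is available, uses a synthetic dataset from `make_classification` as a stand-in for the COVID-19 symptoms dataset, and picks an 80/20 train/test split and 5-fold cross-validation, none of which are stated in the abstract. The accuracies it prints therefore will not match the reported figures.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary-classification data standing in for the
# symptoms dataset, which is not reproduced here.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, random_state=0
)

# Assumption: an 80/20 stratified split; the abstract does not give the ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "DT": DecisionTreeClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
}

results = {}
for name, model in models.items():
    # Assumption: 5-fold cross-validation on the training portion.
    cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
    # Fit on the full training portion, then score on the held-out test set.
    model.fit(X_train, y_train)
    test_accuracy = model.score(X_test, y_test)
    results[name] = {
        "cv_accuracy": cv_scores.mean(),
        "test_accuracy": test_accuracy,
    }
    print(f"{name}: CV accuracy = {cv_scores.mean():.3f}, "
          f"test accuracy = {test_accuracy:.3f}")
```

Reporting the mean cross-validation accuracy alongside the held-out test accuracy mirrors the two figures quoted for each model in the abstract. Which model ranks higher depends on the data, so the ordering on the synthetic set need not match the paper's.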