
Machine Learning Models for Diagnostic Classification of Hepatitis C Tests
Author(s) - Oladosu Oyebisi Oladimeji, Abimbola Oladimeji, Olanrewaju Oladimeji
Publication year - 2021
Publication title - Frontiers in Health Informatics
Language(s) - English
Resource type - Journals
ISSN - 2676-7104
DOI - 10.30699/fhi.v10i1.274
Subject(s) - receiver operating characteristic , oversampling , random forest , artificial intelligence , machine learning , matthews correlation coefficient , precision and recall , correlation coefficient , medicine , statistics , computer science , mathematics , support vector machine , telecommunications , bandwidth (computing)
Hepatitis C is a chronic infection caused by the hepatitis C virus, a bloodborne virus; infection therefore occurs through exposure to even small quantities of blood. The World Health Organization (WHO) estimates that it has affected 71 million people worldwide. The infection imposes heavy costs on individuals, groups and governments because no vaccine is yet available for it. The disease is likely to continue to affect more people because its long asymptomatic phase hinders early detection.
Material and Methods: In this study, we present machine learning models that automatically classify hepatitis C diagnostic test results, and we rank the test features to show how each contributes to the classification, which can support decision making in the health care industry. The synthetic minority oversampling technique (SMOTE) was used to address class imbalance in the dataset.
Results: The models were evaluated using the Matthews correlation coefficient, F-measure, Precision-Recall curve and Receiver Operating Characteristic Area Under Curve. Using SMOTE improved the performance of the predictive models. Random forest (RF) performed best, with a Matthews correlation coefficient of 0.99, F-measure of 0.99, Precision-Recall curve of 1.00 and Receiver Operating Characteristic Area Under Curve of 0.99.
Conclusion: These findings have the potential to influence clinical practice when health workers aim to classify diagnostic results at an early stage of the disease.
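Illustration: The abstract names SMOTE and two of its evaluation metrics without showing how they work. The following is a minimal sketch of the core ideas in plain Python, not the authors' implementation. The function names (`smote_samples`, `confusion_counts`, `f_measure`, `mcc`), the parameter defaults and the toy labels are all hypothetical and chosen only for illustration. In practice, established libraries such as imbalanced-learn and scikit-learn would normally be used instead.

```python
import math
import random


def smote_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea).
    Assumes `minority` holds at least two numeric tuples of equal length."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn


def f_measure(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 by convention if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels for illustration only; these are not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("F-measure:", f_measure(tp, fp, fn))
print("MCC:", mcc(tp, tn, fp, fn))
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which is one reason it is often preferred when classes are imbalanced, as in the dataset described above.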