
Prediction of Chronic Kidney Disease Using Machine Learning
Author(s) -
Himani Hatwar,
Divya Chhaprwal,
Dvarkesh Rokade,
Santosh Kale
Publication year - 2022
Publication title -
International Journal of Advanced Research in Science, Communication and Technology
Language(s) - English
Resource type - Journals
ISSN - 2581-9429
DOI - 10.48175/ijarsct-2586
Subject(s) - feature selection, artificial intelligence, support vector machine, random forest, classifier (uml), regression, computer science, logistic regression, machine learning, pattern recognition (psychology), linear regression, kidney disease, statistics, mathematics, medicine
Chronic kidney disease (CKD) is one of the most serious illnesses today, and an accurate diagnosis as early as possible is vital. Machine learning has proven effective in medical treatment, and machine learning classifiers can help doctors diagnose the disease early. This article examines CKD prediction from this standpoint. The Chronic Kidney Disease dataset was obtained from the University of California, Irvine (UCI) repository. The classifiers used in this study were an artificial neural network, C5.0, Chi-square Automatic Interaction Detector (CHAID), logistic regression, linear support vector machine (LSVM) with L1 penalty and with L2 penalty, and random forest. Feature selection was also applied to the dataset. Results were computed for each classifier using (i) full features, (ii) correlation-based feature selection, (iii) wrapper-method feature selection, (iv) least absolute shrinkage and selection operator (LASSO) regression, (v) synthetic minority over-sampling technique (SMOTE) with LASSO-selected features, and (vi) SMOTE with full features. The results show that LSVM with L2 penalty achieves the maximum accuracy of 98.86 percent when SMOTE is applied with full features. In addition to accuracy, the precision, recall, F-measure, area under the curve, and Gini coefficient were computed, and the results of the various algorithms are compared in a graph. After SMOTE with full features, SMOTE with LASSO-selected features produced the best results.
Again, the linear support vector machine achieved the maximum accuracy, 98.46 percent, when the synthetic minority over-sampling technique was combined with features selected by the least absolute shrinkage and selection operator. Along with the machine learning models, a deep neural network was applied to the same dataset and achieved the highest accuracy, 99.6 percent.
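The evaluation measures named in the abstract (accuracy, precision, recall, F-measure, area under the curve, and the Gini coefficient) can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the labels and scores are invented toy values rather than results from the CKD dataset, and the Gini coefficient is assumed to follow the common convention Gini = 2 * AUC - 1, which the abstract does not state explicitly.

```python
# Minimal, dependency-free sketch of the evaluation measures listed in the
# abstract. All names and data here are illustrative assumptions.

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def roc_auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini_from_auc(auc):
    """Gini coefficient under the assumed convention Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


# Toy example: 1 = CKD, 0 = not CKD (invented values, not study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics)
print("AUC:", auc, "Gini:", gini_from_auc(auc))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a dataset of a few hundred records; a rank-based formulation would be the usual choice for larger data.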