Open Access
Prediction of Diabetics using Machine Learning
Author(s) - G. Geetha, Dr. K. Mohana Prasad
Publication year - 2020
Publication title - International Journal of Recent Technology and Engineering
Language(s) - English
Resource type - Journals
ISSN - 2277-3878
DOI - 10.35940/ijrte.e6290.018520
Subject(s) - naive bayes classifier, random forest, categorization, bayes' theorem, machine learning, set (abstract data type), data set, artificial intelligence, computer science, tamil, process (computing), data mining, statistics, mathematics, support vector machine, bayesian probability, linguistics, philosophy, programming language, operating system
Around 50.9 million people in India suffer from diabetes, and Tamil Nadu ranks second among Indian states. The main objective of this paper is to build predictive models from the medical data of patients with and without diabetes. We aim to create hybrid models that doctors can easily use when treating patients with diabetes. Naïve Bayes and Random Forest algorithms are used to predict whether a person has diabetes, based on his or her health conditions. This enables doctors to group, classify, and categorize the disease type so that treatment can be given accordingly. We split the dataset into (1) a training set and (2) a testing set and perform analysis on them. The Pima Indian dataset was used to study and analyze the data, together with data mining techniques. The data were obtained from the National Institute of Diabetes and Digestive and Kidney Diseases and contain n medical predictor variables and one target variable. Initially, we replace the null values in the dataset with the mean values of the respective columns. We then split the dataset in several ways for analysis: 85/15, 80/20, 70/30, and 60/40. After preparing the dataset, we apply the Naïve Bayes and Random Forest algorithms to it. The Naïve Bayes algorithm is used to compute probabilities from the features (columns), which it treats as independent. The dataset is given as input, and prediction takes place according to the NB model. The Random Forest algorithm is used to perform feature selection. It takes n inputs from the dataset and builds numerous uncorrelated decision trees during training. It then outputs the class that is the mode of the classes predicted by the individual trees.
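The preprocessing and Naïve Bayes steps described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code. It uses only the Python standard library, and the toy data are randomly generated stand-ins, not the Pima Indian dataset. The helper names (`impute_column_means`, `split_train_test`, `fit_gaussian_nb`, `predict_gaussian_nb`, `majority_vote`) and the two feature names are hypothetical. The Gaussian likelihood is also an assumption, since the abstract does not say which Naïve Bayes variant was used. Random Forest training is not implemented here; only its final step, taking the mode of the per-tree votes, is shown with made-up tree outputs.

```python
import math
import random
from statistics import mean, mode, pvariance


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    col_means = [mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)]
    return [[col_means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def split_train_test(X, y, test_fraction, seed=0):
    """Shuffle row indices and hold out `test_fraction` of the rows for testing."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(X) * test_fraction)
    train, test = idx[:len(idx) - n_test], idx[len(idx) - n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def fit_gaussian_nb(X, y):
    """Store, per class, the prior and each feature's mean and variance."""
    model = {}
    for c in set(y):
        cols = list(zip(*[x for x, label in zip(X, y) if label == c]))
        model[c] = (len(cols[0]) / len(X),
                    [mean(col) for col in cols],
                    [pvariance(col) + 1e-9 for col in cols])  # avoid zero variance
    return model


def predict_gaussian_nb(model, x):
    """Pick the class with the highest log posterior, treating features as independent."""
    def log_posterior(c):
        prior, mus, variances = model[c]
        log_lik = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                      for xi, m, v in zip(x, mus, variances))
        return math.log(prior) + log_lik
    return max(model, key=log_posterior)


def majority_vote(tree_predictions):
    """Final Random Forest step: the mode of the classes voted by the trees."""
    return mode(tree_predictions)


# Synthetic stand-in data (NOT the Pima dataset): two made-up features per row.
rng = random.Random(42)
X_raw, y = [], []
for i in range(100):
    label = i % 2
    glucose = rng.gauss(150 if label else 100, 10)
    bmi = rng.gauss(34 if label else 24, 3)
    X_raw.append([None if i % 7 == 0 else glucose, bmi])  # inject some nulls
    y.append(label)

X = impute_column_means(X_raw)

# Evaluate the same four train/test ratios listed in the abstract.
results = {}
for test_fraction in (0.15, 0.20, 0.30, 0.40):
    X_tr, y_tr, X_te, y_te = split_train_test(X, y, test_fraction)
    model = fit_gaussian_nb(X_tr, y_tr)
    hits = sum(predict_gaussian_nb(model, x) == t for x, t in zip(X_te, y_te))
    results[test_fraction] = hits / len(y_te)
    print(f"test fraction {test_fraction:.2f}: NB accuracy {results[test_fraction]:.2f}")

# Hypothetical votes from three trees; the forest reports the most common class.
print("forest vote:", majority_vote([1, 0, 1]))
```

In practice a library such as scikit-learn would likely replace these hand-written helpers. The sketch is only meant to make the sequence of steps concrete: impute, split, fit, predict, vote.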
