Open Access
GANs and VAEs As Methods of Synthetic Data Generation and Augmentation to Enhance Heart Disease Prediction
Author(s) - Rohit Sahoo, Vedang Naik, Saurabh Singh, Shaveta Malik
Publication year - 2021
Publication title - International Journal of Engineering and Advanced Technology
Language(s) - English
Resource type - Journals
ISSN - 2249-8958
DOI - 10.35940/ijeat.b3263.1211221
Subject(s) - computer science, discriminative model, machine learning, artificial intelligence, generative grammar, identification (biology), random forest, data mining, botany, biology
Heart disease cases are rising at an alarming rate, and predicting the condition in advance is critical. Diagnosis is challenging and must be both accurate and fast. In many areas of research, a lack of relevant data is the limiting factor. Data augmentation is a strategy for improving the training of discriminative models, and it can be carried out in several ways. Recent advances in deep generative models offer new approaches to enriching existing data sets. Generative models such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) are frequently used to generate high-quality, realistic synthetic data, which is valuable to machine learning algorithms across a range of classification problems. In our case, we were provided with 304 rows of heart disease data from which to build a robust model for predicting the presence of heart disease in a patient. With so little training data, however, identification of heart disease would not be effective. To tackle this problem, we used a GAN, a Conditional GAN (CGAN), and a VAE to generate synthetic data and augment the original dataset, with the aim of increasing the accuracy of models trained on the new dataset. We applied classification-based machine learning models: Logistic Regression, Decision Trees, K-Nearest Neighbors (KNN), and Random Forest. Each model was trained on the original dataset and on each of the augmented datasets, and the resulting accuracies were compared. Our research suggests that augmenting the data with these generation techniques significantly boosts the accuracy of the machine learning models trained on it.
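The comparison described in the abstract (train a classifier on the original data, train it again on the augmented data, then compare accuracy) can be illustrated with a small sketch. This is not the authors' code, and everything in it is an assumption made for illustration: `make_toy_data` produces hypothetical two-feature rows in place of the heart disease table, `sample_synthetic` is a per-class Gaussian sampler standing in for a trained GAN, CGAN, or VAE, and a hand-written KNN stands in for the four classifiers. Holding out a test set of real rows only is also an assumption, since the abstract does not describe the split.

```python
import random
import statistics
from collections import Counter


def make_toy_data(n, rng):
    """Hypothetical stand-in for the heart disease table:
    two numeric features and a binary label per row."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.5 if label else -1.5
        rows.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    return rows


def sample_synthetic(train, n_per_class, rng):
    """Placeholder generator (NOT a GAN or VAE): fits an independent Gaussian
    to each feature within each class and samples new labelled rows from it.
    A trained generative model would take this function's place."""
    synthetic = []
    for label in sorted({y for _, y in train}):
        columns = list(zip(*[x for x, y in train if y == label]))
        means = [statistics.mean(c) for c in columns]
        stdevs = [statistics.stdev(c) for c in columns]
        for _ in range(n_per_class):
            features = [rng.gauss(m, s) for m, s in zip(means, stdevs)]
            synthetic.append((features, label))
    return synthetic


def knn_predict(train, x, k=5):
    """Majority vote among the k training rows closest to x (squared Euclidean)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of test rows whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


rng = random.Random(0)
data = make_toy_data(200, rng)
train, test = data[:150], data[150:]  # the test set holds real rows only

# Augment the training split only; synthetic rows never enter the test set.
synthetic = sample_synthetic(train, 100, rng)
augmented = train + synthetic

acc_original = accuracy(train, test)
acc_augmented = accuracy(augmented, test)
print(f"original train rows:  {len(train)}, accuracy: {acc_original:.3f}")
print(f"augmented train rows: {len(augmented)}, accuracy: {acc_augmented:.3f}")
```

The generator is fitted on the training split alone and both classifiers are scored on the same held-out real rows, so any difference between the two accuracies reflects the added synthetic rows and not leakage from the test set. On toy data like this the two numbers need not differ much; the sketch shows the shape of the comparison, not the paper's results.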
