Open Access
Audio Enhancement and Synthesis using Generative Adversarial Networks: A Survey
Author(s) - Norberto Torres-Reyes, Shahram Latifi
Publication year - 2019
Publication title - International Journal of Computer Applications
Language(s) - English
Resource type - Journals
ISSN - 0975-8887
DOI - 10.5120/ijca2019918334
Subject(s) - computer science, adversarial system, generative grammar, generative adversarial network, multimedia, artificial intelligence, speech recognition, deep learning
Generative adversarial networks (GANs) have become prominent in the field of machine learning. They are based on a minimax game in which a generator and a discriminator “compete” against each other until an optimal point is reached. The goal of the generator is to produce synthetic samples that match the real data. The discriminator tries to classify the real data as real and the generated data as not real. Through this competition, the generator improves to the point where the discriminator can no longer distinguish the generated data from the real data. GANs have been successfully applied in the image processing field through a wide range of variant architectures. Although not as prominent there, GANs have also benefited the audio enhancement and synthesis field in a variety of forms. This survey explores GAN-based techniques for speech synthesis, speech enhancement, music generation, and general audio synthesis. It examines the strengths and weaknesses of GANs, including variants created to combat those weaknesses. It also explores a few similar machine learning architectures that may help achieve promising results.
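The minimax game described in the abstract can be illustrated with a small numeric sketch. The abstract itself does not state an objective, so the code below assumes the standard GAN value function, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator tries to maximize and the generator tries to minimize. The function name `gan_value` and the sample scores are hypothetical and chosen only for illustration.

```python
import math

def gan_value(d_real, d_fake):
    """Empirical value V(D, G) = mean(log D(x)) + mean(log(1 - D(G(z)))).

    d_real: discriminator outputs in (0, 1) on real samples
    d_fake: discriminator outputs in (0, 1) on generated samples
    (Assumed standard GAN objective; not taken from the survey text.)
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A confident discriminator: real samples scored high, generated ones low.
strong_d = gan_value([0.9, 0.95], [0.1, 0.05])

# A fooled discriminator: it outputs 0.5 for every sample, which is the
# point where generated and real data look identical to it.
fooled_d = gan_value([0.5, 0.5], [0.5, 0.5])
```

Under this assumed objective, the fooled discriminator yields a value of -log 4, which is lower than the value reached by the confident discriminator. This matches the intuition in the abstract: as the generator improves, it pushes the discriminator toward being unable to separate real from generated samples.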
