Audio Enhancement and Synthesis using Generative Adversarial Networks: A Survey
Document Type
Article
Publication Date
1-1-2019
Publication Title
International Journal of Computer Applications
Volume
182
Issue
35
First page number
27
Last page number
31
Abstract
Generative adversarial networks (GANs) have become prominent in the field of machine learning. Their premise is a minimax game in which a generator and a discriminator “compete” against each other until an optimal point is reached. The goal of the generator is to produce synthetic samples that match real data. The discriminator tries to classify real data as real and generated data as fake. Through this competition, the generator improves to the point where the discriminator can no longer tell generated data from real data. GANs have been successfully applied in image processing across a wide range of variant architectures. Although less prominent there, the audio enhancement and synthesis field has also benefited from GANs in a variety of forms. This survey explores GAN-based techniques for speech synthesis, speech enhancement, music generation, and general audio synthesis. It examines the strengths and weaknesses of GANs, including variants created to combat those weaknesses, and explores a few similar machine learning architectures that may help achieve promising results.
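The minimax game in the abstract is usually written as a value function that the discriminator tries to maximize and the generator tries to minimize: the mean of log D(x) over real samples plus the mean of log(1 - D(G(z))) over generated samples. The sketch below is only an illustration of that standard objective, not code from the surveyed paper. The function name `gan_value` and the example discriminator outputs are assumptions chosen for the demonstration.

```python
import math

def gan_value(d_real, d_fake):
    """Empirical GAN value: mean log D(x) + mean log(1 - D(G(z))).

    d_real: discriminator outputs (probability of "real") on real samples.
    d_fake: discriminator outputs on generated samples.
    Both argument names are hypothetical, used only for this sketch.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A confident discriminator: real data scored near 1, generated data near 0.
confident = gan_value([0.9, 0.95], [0.1, 0.05])

# A fooled discriminator: it outputs 0.5 everywhere because it cannot
# tell generated data from real data (the "optimal point" of the game).
fooled = gan_value([0.5, 0.5], [0.5, 0.5])

print(confident, fooled)
```

When the discriminator is reduced to guessing, the value equals -log 4, which is lower than the value a confident discriminator achieves. This is the sense in which the generator "wins" by driving the discriminator toward 0.5 on every input.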
Keywords
Generative Adversarial Networks; Survey; Audio Synthesis; Audio Enhancement; Audio; Synthesis; Enhancement
Disciplines
Electrical and Computer Engineering | Engineering
Language
English
Repository Citation
Latifi, S., Torres-Reyes, N. (2019). Audio Enhancement and Synthesis using Generative Adversarial Networks: A Survey. International Journal of Computer Applications, 182(35), 27-31.