Open Access
SDGAN: Improve Speech Enhancement Quality by Information Filter
Author(s) - Xiaozhou Guo, Yi Liu, Wenyu Mao, Jixing Li, Wenchang Li, Guoliang Gong, Huaxiang Lu
Publication year - 2021
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1871/1/012063
Subject(s) - computer science , regularization (linguistics) , redundancy (engineering) , pesq , speech recognition , filter (signal processing) , noise reduction , speech enhancement , encoder , convolution (computer science) , artificial intelligence , artificial neural network , computer vision , operating system
Speech denoising models based on generative adversarial networks (GANs) have achieved better results than traditional machine learning models. In this paper, we discuss how the shortcut connections in the generator influence information transfer between the encoder and the decoder, and we propose SDGAN to address this. SDGAN places linear and convolution filters in the shortcut connections, which adaptively learn how best to process the information passing through them. With the information filter, the generator can still mitigate the vanishing-gradient problem, while also avoiding information redundancy and improving its expressive ability. In addition, SDGAN replaces the L1 regularization term in the loss function with an L2 regularization term, which not only brings the generator's output speech closer to the clean speech but also avoids sparsity. In the experiments, SDGAN performs significantly better than other traditional GANs on five performance metrics (such as PESQ), and the convolution filter is more effective than the linear filter.
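The abstract does not give the filter definitions or the exact loss, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a shortcut connection concatenates encoder features onto decoder features, that the "linear filter" is a single learnable scale and bias, and that the "convolution filter" is a small 1-D kernel. All function names and parameter values are hypothetical, and a real model would learn the filter parameters during training.

```python
from typing import Callable, List

Signal = List[float]


def linear_filter(x: Signal, scale: float, bias: float = 0.0) -> Signal:
    """Assumed linear filter: one scale and one bias applied to every value."""
    return [scale * v + bias for v in x]


def conv_filter(x: Signal, kernel: Signal) -> Signal:
    """Assumed convolution filter: 1-D sliding window with zero padding,
    so the output has the same length as the input (odd kernel length)."""
    assert len(kernel) % 2 == 1, "this sketch assumes an odd kernel length"
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def skip_connect(decoder_feat: Signal, encoder_feat: Signal,
                 filt: Callable[[Signal], Signal]) -> Signal:
    """Concatenate decoder features with *filtered* encoder features.
    Passing an identity function recovers a plain shortcut connection."""
    return list(decoder_feat) + list(filt(encoder_feat))


def l1_term(output: Signal, clean: Signal) -> float:
    """Mean absolute error between generated and clean speech."""
    return sum(abs(o - c) for o, c in zip(output, clean)) / len(clean)


def l2_term(output: Signal, clean: Signal) -> float:
    """Mean squared error between generated and clean speech."""
    return sum((o - c) ** 2 for o, c in zip(output, clean)) / len(clean)


if __name__ == "__main__":
    enc = [1.0, 2.0, 3.0]   # toy encoder features
    dec = [0.5, 0.5]        # toy decoder features
    print(skip_connect(dec, enc, lambda x: linear_filter(x, scale=0.5)))
    print(skip_connect(dec, enc, lambda x: conv_filter(x, [0.25, 0.5, 0.25])))
    print(l1_term([1.0, 2.0], [0.0, 0.0]), l2_term([1.0, 2.0], [0.0, 0.0]))
```

In this sketch the only difference between the two variants is the function passed to `skip_connect`, which makes it easy to compare an unfiltered shortcut, a linear filter, and a convolution filter under otherwise identical conditions. The L2 term penalizes large deviations from the clean signal more heavily than the L1 term does.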