
Gated Time Delay Neural Network for Speech Recognition
Author(s) -
Kaibin Chen,
Weibin Zhang,
Dongpeng Chen,
Xiaorong Huang,
Bo-Ji Liu,
Xiangmin Xu
Publication year - 2019
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1229/1/012077
Subject(s) - computer science , artificial neural network , speech recognition , artificial intelligence , time delay neural network , speech signal
In deep neural networks, the gate mechanism is a very effective tool for controlling the flow of information. For example, the gates of Long Short-Term Memory (LSTM) networks help alleviate the vanishing gradient problem while preserving useful information. We believe a system benefits when it learns to explicitly focus on the relevant dimensions of its input. In this paper, we propose Gated Time Delay Neural Networks (Gated TDNN) for speech recognition. Time-delay layers model the long-range temporal context of the speech signal, while the gate mechanism enables the model to discover the relevant dimensions of the input. Our experimental results on the Switchboard and LibriSpeech data sets demonstrate the effectiveness of the proposed method.
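The abstract does not give the paper's exact equations, but the idea it describes — a time-delay (dilated convolution) layer whose output is scaled elementwise by a learned sigmoid gate — can be sketched as follows. This is a minimal NumPy illustration under that assumption; the function names, weight shapes, and the tanh/sigmoid pairing are illustrative choices, not the authors' published formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tdnn_layer(x, w, dilation=1):
    """One time-delay layer: a dilated 1D convolution over time.

    x: (T, d_in) sequence of input frames.
    w: (k, d_in, d_out) weights for k temporal context offsets.
    Returns (T_out, d_out) with T_out = T - (k - 1) * dilation.
    """
    k = w.shape[0]
    T_out = x.shape[0] - (k - 1) * dilation
    out = np.zeros((T_out, w.shape[2]))
    for t in range(T_out):
        for i in range(k):
            # Each output frame pools k input frames spaced `dilation` apart,
            # which is how TDNNs cover long temporal context cheaply.
            out[t] += x[t + i * dilation] @ w[i]
    return out

def gated_tdnn_layer(x, w_h, w_g, dilation=1):
    """Gated TDNN layer (illustrative): a sigmoid gate in (0, 1) scales each
    output dimension, letting the model emphasize relevant dimensions."""
    h = np.tanh(tdnn_layer(x, w_h, dilation))     # candidate features
    g = sigmoid(tdnn_layer(x, w_g, dilation))     # per-dimension gate
    return g * h

# Toy usage: 100 frames of 40-dim features, context width 3, dilation 2.
rng = np.random.default_rng(0)
x = rng.standard_normal((100, 40))
w_h = 0.1 * rng.standard_normal((3, 40, 64))
w_g = 0.1 * rng.standard_normal((3, 40, 64))
y = gated_tdnn_layer(x, w_h, w_g, dilation=2)
print(y.shape)  # (96, 64): 100 - (3 - 1) * 2 output frames
```

Because the gate is in (0, 1) and the candidate features are in (-1, 1), every output entry is bounded in magnitude by 1, and gate values near zero effectively suppress irrelevant dimensions, which matches the selective-focus behavior the abstract describes.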