Multichannel Speech Enhancement in Vehicle Environment Based on Interchannel Attention Mechanism
Author(s) -
Xueli Shen,
Zhenxing Liang,
Shiyin Li,
Yanji Jiang
Publication year - 2021
Publication title -
Journal of Advanced Transportation
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.577
H-Index - 46
eISSN - 2042-3195
pISSN - 0197-6729
DOI - 10.1155/2021/9453911
Subject(s) - PESQ, computer science, speech enhancement, frame (networking), artificial intelligence, feature extraction, feature (linguistics), speech recognition, cockpit, waveform, attention network, noise (video), signal (programming language), pattern recognition (psychology), computer vision, noise reduction, engineering, radar, telecommunications, image (mathematics), linguistics, philosophy, aerospace engineering, programming language
Speech enhancement in a vehicle environment remains a challenging task due to complex noise. This paper presents a feature extraction method that applies an interchannel attention mechanism frame by frame to learn spatial features directly from multichannel speech waveforms. The spatial features of the individual signals learned through the proposed method are provided as input to a two-stage BiLSTM network, which is trained to perform adaptive spatial filtering as time-domain filters spanning the signal channels. The two-stage BiLSTM network is capable of extracting both local and global features and achieves competitive results. Using scenarios and data based on car-cockpit simulations, the results show that, compared with other methods that extract features from multichannel data, the proposed method delivers significant gains in terms of SDR, SI-SNR, PESQ, and STOI.
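To illustrate the frame-by-frame interchannel attention idea described in the abstract, the following is a minimal NumPy sketch. It is an assumption-laden illustration, not the paper's implementation: the framing parameters, the use of the cross-channel mean as the attention query, and the scaled dot-product scoring are all choices made here for clarity.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split a multichannel waveform into overlapping frames.

    x: (channels, samples) -> (channels, n_frames, frame_len)
    Frame length and hop are illustrative, not taken from the paper.
    """
    C, T = x.shape
    n_frames = 1 + (T - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[:, idx]  # fancy indexing broadcasts to (C, n_frames, frame_len)

def interchannel_attention(frames):
    """Frame-wise attention across channels (hypothetical formulation).

    frames: (channels, n_frames, frame_len)
    Each channel's frame is scored against the cross-channel mean frame
    (the assumed query) by a scaled dot product; a softmax over the
    channel axis yields per-frame channel weights, and the weighted sum
    gives one spatially attended frame sequence.
    """
    ref = frames.mean(axis=0, keepdims=True)            # (1, n_frames, L)
    scores = np.sum(frames * ref, axis=-1)              # (C, n_frames)
    scores = scores / np.sqrt(frames.shape[-1])         # scale for stability
    w = np.exp(scores - scores.max(axis=0, keepdims=True))
    w = w / w.sum(axis=0, keepdims=True)                # softmax over channels
    return np.sum(w[..., None] * frames, axis=0)        # (n_frames, frame_len)
```

In the paper's pipeline, features of this kind would then feed the two-stage BiLSTM, which performs the adaptive time-domain spatial filtering; that stage is omitted here since its exact architecture is not given in the abstract.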