Open Access
Joint Measurement of Multi-channel Sound Event Detection and Localization Using Deep Neural Network
Author(s) - Yuting Zhou, Hongjie Wan
Publication year - 2022
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/2216/1/012101
Subject(s) - spectrogram, computer science, channel (broadcasting), feature (linguistics), speech recognition, event (particle physics), joint (building), feature extraction, pattern recognition (psychology), microphone, sound (geography), artificial intelligence, recurrent neural network, artificial neural network, acoustics, telecommunications, engineering, architectural engineering, philosophy, linguistics, physics, sound pressure, quantum mechanics
For joint sound event localization and detection (SELD), a deep-learning-based multi-channel sound event method is proposed. A CRNN model is trained on datasets containing at most two overlapping sound events. The difficulty of polyphonic SELD lies in combining sound event detection (SED) and direction-of-arrival (DOA) estimation within a single network; multi-channel audio makes it easier to distinguish overlapping sound events. The model takes a sequence of consecutive spectrograms as input, which feeds two output branches. The first branch, SED, performs multi-label classification over each time segment; the second branch represents the DOA estimate of each sound event as 3-D Cartesian coordinates. Phase and magnitude features of the spectrum are extracted from each audio channel, so feature extraction is not restricted to a particular microphone-array geometry.
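As a rough illustration of the front end and output representation described above, the sketch below computes per-channel magnitude and phase spectrograms from multi-channel audio and converts an azimuth/elevation DOA to the 3-D Cartesian unit vector used as the regression target. This is a minimal NumPy sketch under assumed parameters (FFT size, hop length, sample rate); the paper's actual network, dataset, and hyperparameters are not reproduced here.

```python
import numpy as np

def stft_features(audio, n_fft=512, hop=256):
    """Per-channel magnitude and phase spectrograms via a Hann-windowed STFT.

    audio: (channels, samples) array.
    Returns (2 * channels, frames, n_fft // 2 + 1): magnitude features for
    each channel stacked first, followed by the phase features.
    Parameters are illustrative assumptions, not the paper's settings.
    """
    window = np.hanning(n_fft)
    channels, samples = audio.shape
    frames = 1 + (samples - n_fft) // hop
    mags, phases = [], []
    for ch in range(channels):
        # Complex spectrogram: one rfft per windowed frame.
        spec = np.stack([
            np.fft.rfft(window * audio[ch, i * hop : i * hop + n_fft])
            for i in range(frames)
        ])
        mags.append(np.abs(spec))      # magnitude feature
        phases.append(np.angle(spec))  # phase feature
    return np.concatenate([np.stack(mags), np.stack(phases)])

def doa_to_cartesian(azimuth, elevation):
    """Map a DOA (radians) to the 3-D Cartesian unit vector (x, y, z)."""
    return np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])

# Hypothetical 4-channel input: 1 second of noise at 16 kHz.
feats = stft_features(np.random.randn(4, 16000))  # shape (8, 61, 257)
doa = doa_to_cartesian(np.pi / 4, np.pi / 6)      # unit-norm target vector
```

The two-branch network would then consume `feats` and emit, per time segment, a multi-label class-activity vector (SED branch) and one such Cartesian vector per active event (DOA branch).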
