Open Access
Weakly and semi-supervised learning for sound event detection using image pretrained convolutional recurrent neural network, weighted pooling and mean teacher method
Author(s) -
Xichang Cai,
Dongchi Yu,
Duxin Liu,
Menglong Wu
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/2010/1/012108
Subject(s) - computer science , pooling , artificial intelligence , convolutional neural network , pattern recognition (psychology) , artificial neural network , image (mathematics) , deep learning , focus (optics) , event (particle physics) , transfer of learning , speech recognition , machine learning , physics , quantum mechanics , optics
In this paper, we propose a sound event detection (SED) method that uses a deep neural network trained on weakly labeled and unlabeled data. The proposed method uses a convolutional recurrent neural network (CRNN) to extract high-level features from audio clips. Inspired by the impressive performance of transfer learning in image recognition, the convolutional neural network (CNN) component of the proposed CRNN is an image-pretrained model. Although audio and images differ significantly, the image-pretrained CNN still achieves competitive performance in SED and effectively reduces the amount of training data needed. To learn from weakly labeled data, the proposed method uses a weighted pooling strategy that enables the network to focus on the frames of an audio clip that contain events. For unlabeled data, the proposed method applies the mean teacher semi-supervised learning method together with data augmentation. To demonstrate its performance, we conduct an experimental evaluation on the DCASE 2021 Task 4 dataset. The results show that the proposed method outperforms the DCASE 2021 Task 4 baseline.
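The two mechanisms named in the abstract can be illustrated in a few lines. Below is a minimal NumPy sketch, not the authors' implementation: `weighted_pool` shows the general idea of attention-style weighted pooling, where softmax-normalized per-frame weights aggregate frame-level event probabilities into a clip-level prediction (so weak, clip-level labels can supervise frame-level outputs), and `mean_teacher_update` shows the exponential moving average (EMA) that defines the teacher in the mean teacher method. All function names, shapes, and the decay value are illustrative assumptions.

```python
import numpy as np

def weighted_pool(frame_probs, frame_weights):
    """Aggregate frame-level event probabilities into one clip-level
    probability using per-frame attention weights (weighted pooling).

    frame_probs:   (T,) predicted probabilities of one event class per frame
    frame_weights: (T,) unnormalized attention scores for the same frames
    """
    # Softmax-normalize the weights so that frames the network believes
    # contain the event dominate the clip-level prediction.
    w = np.exp(frame_weights - frame_weights.max())
    w /= w.sum()
    return float(np.dot(w, frame_probs))

def mean_teacher_update(teacher_params, student_params, ema_decay=0.999):
    """Update the teacher as an exponential moving average of the
    student's weights, as in the mean teacher method (Tarvainen &
    Valpola); the teacher then provides consistency targets on
    unlabeled (and augmented) data."""
    return [ema_decay * t + (1.0 - ema_decay) * s
            for t, s in zip(teacher_params, student_params)]

# Example: frames 0 and 3 get high attention, so the clip-level
# probability is dominated by their frame-level probabilities.
probs = np.array([0.9, 0.1, 0.1, 0.8])
weights = np.array([5.0, -5.0, -5.0, 4.0])
clip_prob = weighted_pool(probs, weights)
```

The pooling weights here are given by hand for clarity; in the paper's setting they would be produced by a learned branch of the CRNN, and both the pooled clip-level loss (against weak labels) and the teacher-student consistency loss (on unlabeled data) would be backpropagated through the student network only.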