
Blind source separation‐based IVA‐Xception model for bird sound recognition in complex acoustic environments
Author(s) -
Dai Yusheng,
Yang Jin,
Dong Yiwei,
Zou Haipeng,
Hu Mingzhi,
Wang Bin
Publication year - 2021
Publication title -
Electronics Letters
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.375
H-Index - 146
eISSN - 1350-911X
pISSN - 0013-5194
DOI - 10.1049/ell2.12160
Subject(s) - computer science , source separation , artificial intelligence , pattern recognition (psychology) , noise (video) , focus (optics) , feature extraction , convolutional neural network , speech recognition , sound (geography) , acoustics , physics , image (mathematics) , optics
Identifying bird species from audio recordings is a major area of interest in ecological surveillance and biodiversity conservation. Previous studies have successfully identified bird species from given recordings. However, most of these studies apply only to low‐noise acoustic environments and to recordings that contain the sound of a single bird at a time. In reality, bird audio recorded in the wild often contains overlapping signals, such as the dawn chorus, which makes audio feature extraction and accurate classification extremely difficult. This study is the first to focus on applying a blind source separation method to identify all foreground bird species contained in overlapping vocalization recordings. The proposed IVA‐Xception model is based on independent vector analysis and a convolutional neural network. Experiments on the 2020 Bird Sound Recognition in Complex Acoustic Environments competition (BirdCLEF2020) dataset show that this model achieves a higher macro F1‐score and average accuracy than state‐of‐the‐art methods.
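Because each recording may contain several foreground species, the macro F1‐score mentioned above can be read as a per‐species F1 averaged with equal weight across species. The following is a minimal sketch of that metric for multi‐label predictions, not the authors' evaluation code: the function name `macro_f1`, the species names, and the choice to score a species with no true or predicted occurrences as 0 are all assumptions made for illustration.

```python
from typing import List, Set


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Macro-averaged F1 over every species seen in the truth or predictions."""
    labels = sorted(set().union(*y_true, *y_pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        denom = 2 * tp + fp + fn
        # Assumption: a species with no support and no predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    # Every species contributes equally, regardless of how often it occurs.
    return sum(scores) / len(scores)


# Hypothetical labels for three recordings (species names are illustrative).
truth = [{"robin", "wren"}, {"robin"}, {"finch"}]
predicted = [{"robin"}, {"robin", "wren"}, {"finch"}]
score = macro_f1(truth, predicted)
print(f"macro F1 = {score:.3f}")
```

In this toy case "robin" and "finch" are recovered perfectly while "wren" is missed once and predicted once where absent, so the rare, poorly detected species pulls the average down as much as a common one would.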