Frame-Wise Dynamic Threshold Based Polyphonic Acoustic Event Detection
Author(s) - Xianjun Xia, Roberto Togneri, Ferdous Sohel, Defeng Huang
Publication year - 2017
Publication title - Interspeech 2017
Language(s) - English
Resource type - Conference proceedings
DOI - 10.21437/interspeech.2017-746
Subject(s) - polyphony, computer science, event (particle physics), frame (networking), set (abstract data type), speech recognition, hidden markov model, pattern recognition (psychology), artificial intelligence, acoustics, telecommunications, physics, quantum mechanics, programming language
Acoustic event detection, which determines the type of an acoustic event and localises it, has been widely applied in real-world applications. Many works adopt multi-label classification techniques to perform polyphonic acoustic event detection, using a global threshold to decide which acoustic events are active. However, the global threshold has to be set manually and is highly dependent on the database being tested. To deal with this, we replace the fixed threshold method with a frame-wise dynamic threshold approach in this paper. Two novel approaches are proposed in this work, namely contour-based and regressor-based dynamic thresholds. Experimental results on the popular TUT Acoustic Scenes 2016 database of polyphonic events demonstrate the superior performance of the proposed approaches.
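The difference between a global threshold and a frame-wise threshold can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the contour-based or regressor-based approaches derive their thresholds, so the per-frame thresholds below are hand-picked placeholders, and the function names and probability values are hypothetical.

```python
from typing import List, Sequence


def detect_global(probs: Sequence[Sequence[float]], threshold: float) -> List[List[bool]]:
    """Mark an event active wherever its probability reaches one fixed threshold."""
    return [[p >= threshold for p in frame] for frame in probs]


def detect_framewise(
    probs: Sequence[Sequence[float]], thresholds: Sequence[float]
) -> List[List[bool]]:
    """Mark an event active wherever its probability reaches that frame's own threshold."""
    if len(probs) != len(thresholds):
        raise ValueError("need exactly one threshold per frame")
    return [[p >= t for p in frame] for frame, t in zip(probs, thresholds)]


# Hypothetical multi-label classifier outputs: 3 frames x 2 event classes.
probs = [
    [0.90, 0.40],
    [0.30, 0.20],
    [0.60, 0.55],
]

# Baseline: a single manually chosen threshold for every frame.
global_active = detect_global(probs, 0.5)

# Frame-wise variant: one threshold per frame. These values are placeholders;
# in the paper they would come from the contour or regressor approach.
frame_thresholds = [0.50, 0.25, 0.58]
dynamic_active = detect_framewise(probs, frame_thresholds)
```

With these placeholder values, the second frame has no active event under the global threshold but one active event under its lower frame-specific threshold, which shows how the two decision rules can disagree on the same classifier output.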