Exploring conventional enhancement and separation methods for multi‐speech enhancement in indoor environments



Open Access


Author(s) - Wei Yangjie, Zhang Ke, Wu Dan, Hu Zhongqi

Publication year - 2021

Publication title - Cognitive Computation and Systems

Language(s) - English

Resource type - Journals

ISSN - 2517-7567

DOI - 10.1049/ccs2.12023

Subject(s) - speech enhancement, computer science, performance enhancement, beamforming, preprocessor, independent component analysis, speech recognition, minimum variance unbiased estimator, diagonal, artificial intelligence, noise reduction, mathematics, mean squared error, telecommunications, statistics, medicine, geometry, physical medicine and rehabilitation

Speech enhancement is an important preprocessing step in many practical fields that involve speech signals, and many signal‐processing methods have already been proposed for it. However, the lack of a comprehensive, quantitative evaluation of enhancement performance for multi‐speech signals makes it difficult to choose an appropriate method for a multi‐speech application. This work studies the implementation of several methods for multi‐speech enhancement in indoor environments with reverberation times of T60 = 0 s and T60 = 0.3 s. Two types of enhancement approaches are proposed and compared. The first type comprises the basic enhancement methods: delay‐and‐sum beamforming (DSB), minimum variance distortionless response (MVDR), linearly constrained minimum variance (LCMV), and independent component analysis (ICA). The second type comprises the robust enhancement methods: improved MVDR and LCMV realized by eigendecomposition and diagonal loading. In addition, online enhancement based on the iteration of single‐frame speech signals is investigated, as is the comprehensive performance of the various enhancement methods. The experimental results show that, among the basic methods, the enhancement effects of LCMV and ICA are relatively stable; among the improved algorithms, methods that employ diagonal loading iterations perform better. For online enhancement, DSB with frequency masking (FM) yields the best signal‐to‐interference ratio (SIR) and can suppress interference. The comprehensive performance test shows that LCMV and ICA yield the best results when there is no reverberation, while DSB with FM yields the best SIR when reverberation is present.
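As a rough illustration of two of the methods named in the abstract, the following minimal sketch computes DSB weights and MVDR weights with optional diagonal loading for a single frequency bin. It is not the authors' implementation: the far-field steering model, the four-microphone linear array, the 1 kHz bin, the noise and interferer powers, and all function and variable names are assumptions chosen only for this example.

```python
import numpy as np

def steering_vector(mic_positions, direction, freq, c=343.0):
    """Far-field steering vector for one frequency bin (assumed model).

    mic_positions: (M, 3) array in metres; direction: unit vector towards the source.
    """
    delays = mic_positions @ direction / c      # relative time offset per microphone
    return np.exp(2j * np.pi * freq * delays)

def dsb_weights(d):
    """Delay-and-sum beamformer: align the phases and average."""
    return d / d.shape[0]

def mvdr_weights(R, d, loading=0.0):
    """MVDR weights w = R^-1 d / (d^H R^-1 d).

    A positive `loading` adds loading * I to R (diagonal loading),
    which regularises an ill-conditioned covariance estimate.
    """
    R_loaded = R + loading * np.eye(R.shape[0])
    Rinv_d = np.linalg.solve(R_loaded, d)
    return Rinv_d / (d.conj() @ Rinv_d)

# Hypothetical setup: 4 microphones on the x axis, 5 cm apart, one 1 kHz bin.
mics = np.array([[0.05 * m, 0.0, 0.0] for m in range(4)])
d_target = steering_vector(mics, np.array([0.0, 1.0, 0.0]), 1000.0)   # broadside
d_interf = steering_vector(mics, np.array([1.0, 0.0, 0.0]), 1000.0)   # endfire

# Assumed covariance: weak white noise plus one unit-power interferer.
R = 0.01 * np.eye(4) + np.outer(d_interf, d_interf.conj())

w_dsb = dsb_weights(d_target)
w_mvdr = mvdr_weights(R, d_target, loading=1e-3)

# Beamformer gain towards a source with steering vector d is |w^H d|.
gain_target_mvdr = abs(np.vdot(w_mvdr, d_target))
gain_interf_dsb = abs(np.vdot(w_dsb, d_interf))
gain_interf_mvdr = abs(np.vdot(w_mvdr, d_interf))
```

Both beamformers keep unit gain towards the target direction. In this toy setup MVDR additionally lowers the gain towards the interferer relative to DSB, and increasing `loading` moves the MVDR weights towards the DSB weights.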
