Performance improvement of monaural speech separation system using image analysis techniques

Open Access

Author(s) - Sivapatham Shoba, Ramadoss Rajavel

Publication year - 2018

Publication title - IET Signal Processing

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.384

H-Index - 42

ISSN - 1751-9683

DOI - 10.1049/iet-spr.2017.0375

Subject(s) - intelligibility (philosophy), computer science, monaural, speech recognition, pixel, pattern recognition (psychology), artificial intelligence, philosophy, epistemology

This research work proposes an image analysis-based algorithm that enhances the time–frequency (T–F) mask obtained in the initial segmentation stage of a computational auditory scene analysis (CASA)-based monaural speech separation system, with the aim of improving speech quality and intelligibility. The algorithm consists of four steps: labelling the initial segmentation mask, boundary extraction, active pixel detection, and elimination of the non-active pixels related to noise. In the labelling step, the T–F mask is labelled as a periodicity pixel (P) matrix and a non-periodicity pixel (NP) matrix. Next, speech boundaries are created by connecting all possible nearby pixels of the P and NP matrices. Some speech boundaries may enclose noisy T–F units as holes; these holes are treated by the proposed algorithm. The algorithm is evaluated with quality and intelligibility measures: signal-to-noise ratio (SNR), perceptual evaluation of speech quality (PESQ), percentage of energy loss (P_EL), percentage of noise residue (P_NR), coherence speech intelligibility index (CSII), normalised covariance metric (NCM), and short-time objective intelligibility (STOI). The experimental results show that, compared with the input noisy speech mixture, the proposed algorithm improves speech quality by increasing the SNR with an average value of 9.91 dB and reducing P_NR by an average value of 25.6%, and also improves speech intelligibility in terms of CSII, NCM, and STOI.
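
The idea of holes inside a speech boundary can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how holes are detected or treated, so the code below only assumes that a binary T–F mask is stored as a list of rows (frequency channels) by columns (time frames) and that a hole is a zero-valued unit that cannot be reached from the mask border through 4-connected zeros. The function name `find_holes` and the example mask are hypothetical.

```python
from collections import deque


def find_holes(mask):
    """Return the set of (row, col) zero-valued units enclosed by ones.

    Assumption (not from the paper): `mask` is a list of equal-length
    lists of 0/1, and a "hole" is any zero that a flood fill started
    from the border zeros cannot reach.
    """
    rows, cols = len(mask), len(mask[0])
    outside = set()
    queue = deque()

    # Seed the flood fill with every zero that lies on the mask border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and mask[r][c] == 0:
                outside.add((r, c))
                queue.append((r, c))

    # Spread through 4-connected zeros; everything reached is background.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside_grid = 0 <= nr < rows and 0 <= nc < cols
            if inside_grid and mask[nr][nc] == 0 and (nr, nc) not in outside:
                outside.add((nr, nc))
                queue.append((nr, nc))

    # Zeros that were never reached are enclosed by ones: the holes.
    return {
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if mask[r][c] == 0 and (r, c) not in outside
    }


# Hypothetical 5x5 mask: a ring of ones with a single enclosed zero.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
holes = find_holes(mask)
```

What happens to the detected units afterwards (relabelling them, discarding them, or testing them against an activity criterion) is left open here, since the abstract only states that the holes are treated by the proposed algorithm.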
