Open Access
Performance improvement of monaural speech separation system using image analysis techniques
Author(s) - Sivapatham Shoba, Ramadoss Rajavel
Publication year - 2018
Publication title - IET Signal Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.384
H-Index - 42
ISSN - 1751-9683
DOI - 10.1049/iet-spr.2017.0375
Subject(s) - intelligibility (philosophy), computer science, monaural, speech recognition, pixel, pattern recognition (psychology), artificial intelligence, philosophy, epistemology
This research work proposes an image analysis‐based algorithm to enhance the time–frequency (T–F) mask obtained in the initial segmentation of a CASA‐based monaural speech separation system, in order to improve speech quality and intelligibility. It consists of labelling the initial segmentation mask, boundary extraction, active pixel detection, and elimination of the non‐active pixels related to noise. In labelling, the T–F mask obtained is labelled as a periodicity pixel (P) matrix and a non‐periodicity pixel (NP) matrix. Next, speech boundaries are created by connecting all possible nearby P and NP matrices. Some speech boundaries may include noisy T–F units as holes; these holes are treated using the proposed algorithm. The proposed algorithm is evaluated with quality and intelligibility measures such as signal‐to‐noise ratio (SNR), perceptual evaluation of speech quality (PESQ), P_EL, P_NR, coherence speech intelligibility index (CSII), normalised covariance metric (NCM), and short‐time objective intelligibility (STOI). The experimental results show that the proposed algorithm improves speech quality by increasing the SNR with an average value of 9.91 dB and reducing the P_NR by an average value of 25.6%, and also improves speech intelligibility in terms of CSII, NCM, and STOI when compared with the input noisy speech mixture.
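
The kind of image-style processing described above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a binary T–F mask stored as a list of rows, uses 4-connected component labelling as a stand-in for the labelling and boundary steps, and treats a "hole" as a group of inactive units fully enclosed by active ones. The function names and the `min_size` threshold are hypothetical, and the sketch only locates holes, since the abstract does not say how they are treated.

```python
from collections import deque


def connected_components(mask, target):
    """Group 4-connected cells equal to `target` into lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != target or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited cell.
            queue, cells = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and mask[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(cells)
    return components


def remove_small_regions(mask, min_size):
    """Return a copy of the mask with active regions smaller than
    `min_size` cleared (assumed here to be noise; hypothetical rule)."""
    cleaned = [row[:] for row in mask]
    for cells in connected_components(mask, 1):
        if len(cells) < min_size:
            for y, x in cells:
                cleaned[y][x] = 0
    return cleaned


def find_holes(mask):
    """Return inactive regions that do not touch the edge of the mask,
    i.e. regions fully enclosed by active units."""
    rows, cols = len(mask), len(mask[0])
    holes = []
    for cells in connected_components(mask, 0):
        if not any(y in (0, rows - 1) or x in (0, cols - 1) for y, x in cells):
            holes.append(cells)
    return holes


# Toy mask: a ring of active units with one enclosed unit, plus one stray unit.
toy_mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 1],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
cleaned_mask = remove_small_regions(toy_mask, min_size=2)
enclosed = find_holes(cleaned_mask)
```

In this toy case the stray unit at row 2, column 6 is cleared and the single enclosed unit at row 2, column 2 is reported as a hole. A real T–F mask would need a connectivity rule and size threshold chosen for the filterbank and frame rate in use.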
