
Sparse representation‐based quasi‐clean speech construction for speech quality assessment under complex environments
Author(s) - Zhou Weili, He Qianhua, Wang Yalou, Li Yanxiong
Publication year - 2017
Publication title - IET Signal Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.384
H-Index - 42
ISSN - 1751-9683
DOI - 10.1049/iet-spr.2016.0555
Subject(s) - pesq , computer science , speech recognition , sparse approximation , speech enhancement , matching pursuit , speech coding , mean opinion score , voice activity detection , intelligibility (philosophy) , speech processing , pattern recognition (psychology) , artificial intelligence , noise reduction , metric (unit) , philosophy , operations management , epistemology , compressed sensing , economics
A non‐intrusive speech quality assessment method for complex environments is proposed. The approach introduces a sparse representation‐based speech reconstruction algorithm that derives quasi‐clean speech from the noisy degraded signal. First, an over‐complete dictionary of the clean speech power spectrum is learned with the K‐singular value decomposition (K‐SVD) algorithm. Then, in the sparse representation stage, the stopping residual error is set adaptively according to the estimated cross‐correlation and the noise spectrum, which is adjusted by an a posteriori SNR‐weighted factor, and orthogonal matching pursuit is applied to reconstruct the clean speech spectrum from the noisy speech. The quasi‐clean speech serves as the reference for a modified PESQ perceptual model, and the mean opinion score of the noisy degraded speech is obtained by estimating the distortions between the quasi‐clean speech and the degraded speech. Experimental results show that the proposed approach achieves a correlation coefficient of 0.925 on the NOIZEUS complex‐environment database, which is 99% similar to the performance of the intrusive standard ITU‐T PESQ and outperforms the non‐intrusive standard ITU‐T P.563 by 7.1%.
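The sparse representation stage described in the abstract can be illustrated with a minimal sketch of orthogonal matching pursuit that stops once the residual norm falls below a threshold. This is not the authors' implementation: the function name `omp`, the parameters `tol` and `max_atoms`, and the random dictionary in the demo are assumptions made for illustration. The abstract does not give the formula for the adaptive stopping error, so the threshold is simply passed in as a number here, and a random matrix stands in for a K‐SVD‐learned dictionary.

```python
import numpy as np

def omp(D, y, tol, max_atoms=None):
    """Greedy OMP: add atoms until the residual norm is at most `tol`.

    D is a dictionary with one atom per column, y is the signal to encode.
    `tol` stands in for the paper's adaptive stopping error (assumption).
    """
    if max_atoms is None:
        max_atoms = min(D.shape)
    residual = np.asarray(y, dtype=float).copy()
    support = []                      # indices of the atoms selected so far
    sol = np.zeros(0)
    while np.linalg.norm(residual) > tol and len(support) < max_atoms:
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:              # no new atom helps, so stop
            break
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    if support:
        coef[support] = sol
    return coef

# Hypothetical demo: random over-complete dictionary with unit-norm atoms.
rng = np.random.default_rng(0)
D_demo = rng.standard_normal((32, 64))
D_demo /= np.linalg.norm(D_demo, axis=0)
x_true = np.zeros(64)
x_true[[5, 20, 41]] = [1.5, -2.0, 1.0]
y_demo = D_demo @ x_true + 0.01 * rng.standard_normal(32)
coef_demo = omp(D_demo, y_demo, tol=0.1)
print("atoms used:", np.count_nonzero(coef_demo))
print("residual norm:", np.linalg.norm(y_demo - D_demo @ coef_demo))
```

A larger `tol` ends the pursuit earlier and leaves more of the signal unexplained. This is why a threshold tied to the estimated noise level, as the abstract describes, can keep the reconstruction from fitting the noise.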