Open Access
Non‐intrusive speech quality assessment using multi‐resolution auditory model features for degraded narrowband speech
Author(s) - Dubey Rajesh Kumar, Kumar Arun
Publication year - 2015
Publication title - IET Signal Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.384
H-Index - 42
ISSN - 1751-9683
DOI - 10.1049/iet-spr.2014.0214
Subject(s) - computer science , speech recognition , mel frequency cepstrum , feature (linguistics) , mixture model , narrowband , feature vector , pattern recognition (psychology) , artificial intelligence , feature extraction , telecommunications , philosophy , linguistics
A multi‐resolution auditory model (MRAM), built on an auditory perception‐based wavelet packet transform, is used for non‐intrusive objective speech quality estimation. The MRAM models the human auditory system in greater time‐frequency detail than earlier models used for non‐intrusive speech quality estimation. The objective Mean Opinion Score (MOS) of a degraded narrowband speech utterance is estimated with a Gaussian Mixture Model (GMM) probabilistic approach applied to the MRAM‐based feature vector. For comparison with the MRAM features, features based on a recent auditory model (Lyon's auditory model), mel‐frequency cepstral coefficients (MFCC), and line spectral frequencies (LSF) are each used independently. A combination of MFCC and LSF features with MRAM features, again using the GMM probabilistic approach, is also proposed and investigated. The performance of these feature vectors is compared with ITU‐T Recommendation P.563 and a recently published method by computing the correlation coefficient and root‐mean‐square error between the subjective MOS and the estimated objective MOS. The proposed method, which combines MRAM, MFCC, and LSF features, performs better than both of these reference algorithms.
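The two evaluation figures named in the abstract, the correlation coefficient and the root‐mean‐square error between subjective and estimated MOS, can be sketched as below. This is a minimal illustration, not the authors' evaluation code: it assumes the Pearson form of the correlation coefficient and an RMSE computed directly on the raw scores without any mapping step. The function names and the example scores are hypothetical.

```python
import math


def pearson_correlation(subjective, objective):
    """Pearson correlation between subjective and estimated MOS lists.

    Assumption: the abstract does not say which correlation coefficient
    is used; the Pearson form is chosen here for illustration.
    """
    n = len(subjective)
    if n != len(objective) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_s = sum(subjective) / n
    mean_o = sum(objective) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(subjective, objective))
    var_s = sum((s - mean_s) ** 2 for s in subjective)
    var_o = sum((o - mean_o) ** 2 for o in objective)
    if var_s == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_s * var_o)


def rmse(subjective, objective):
    """Root-mean-square error between subjective and estimated MOS lists."""
    n = len(subjective)
    if n != len(objective) or n == 0:
        raise ValueError("need two equal-length, non-empty lists")
    squared_error = sum((s - o) ** 2 for s, o in zip(subjective, objective))
    return math.sqrt(squared_error / n)


if __name__ == "__main__":
    # Made-up scores for illustration only; these are not data from the paper.
    subjective_mos = [1.8, 2.5, 3.1, 3.9, 4.4]
    estimated_mos = [2.0, 2.4, 3.3, 3.6, 4.5]
    print("correlation:", pearson_correlation(subjective_mos, estimated_mos))
    print("RMSE:", rmse(subjective_mos, estimated_mos))
```

A higher correlation and a lower RMSE both indicate that the estimated scores track the listeners' ratings more closely, which is the sense in which one feature set "performs better" than another here.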
