Open Access
Self-Supervised Pre-Trained Speech Representation Based End-to-End Mispronunciation Detection and Diagnosis of Mandarin
Author(s) - Yunfei Shen, Qingqing Liu, Zhixing Fan, Jiajun Liu, Aishan Wumaier
Publication year - 2022
Publication title - IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2022.3212417
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Mispronunciation Detection and Diagnosis (MDD) is a core technology in Computer-Assisted Pronunciation Training (CAPT) and Computer-Assisted Language Learning (CALL). MDD research on Mandarin suffers from a lack of relevant data, making it a typical low-resource scenario. In recent years, self-supervised pre-trained speech representations have developed rapidly and achieved significant performance improvements in low-resource speech recognition, which motivates applying them to MDD tasks. First, we build a Mandarin MDD dataset called PSC-Reading for the passage-reading section of the Putonghua Proficiency Test (PSC). Then we extend end-to-end MDD systems based on the CTC/Attention hybrid architecture and the Transformer architecture, replacing conventional speech features such as MFCC and Fbank with features extracted from self-supervised pre-trained speech representation models such as Wav2Vec 2.0 and WavLM, and conduct experiments on the PSC-Reading dataset. Experimental results show that, compared with the CNN-RNN-CTC baseline, our WavLM-based model obtains a 20.5% relative improvement in F1 score.
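As a rough illustration of how an F1 score and a relative improvement of the kind reported here are typically computed, the sketch below may help. It assumes the convention common in MDD work, where a "true rejection" is a mispronunciation correctly flagged, a "false rejection" is a correct pronunciation wrongly flagged, and a "false acceptance" is a missed mispronunciation; the abstract does not state which definition the authors use. The function names and all numbers are hypothetical and are not results from the paper.

```python
def mdd_f1(true_rejections: int, false_rejections: int, false_acceptances: int) -> float:
    """F1 for mispronunciation detection (assumed convention, not from the paper).

    precision = TR / (TR + FR): share of flagged phones that are real errors
    recall    = TR / (TR + FA): share of real errors that were flagged
    """
    flagged = true_rejections + false_rejections
    actual_errors = true_rejections + false_acceptances
    if flagged == 0 or actual_errors == 0 or true_rejections == 0:
        return 0.0
    precision = true_rejections / flagged
    recall = true_rejections / actual_errors
    return 2 * precision * recall / (precision + recall)


def relative_improvement(baseline: float, new: float) -> float:
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical counts and scores, chosen only to show the arithmetic.
    f1 = mdd_f1(true_rejections=80, false_rejections=20, false_acceptances=20)
    gain = relative_improvement(baseline=0.60, new=0.723)
    print(f"F1 = {f1:.3f}, relative improvement = {gain:.1%}")
```

Note that a relative improvement is measured against the baseline score, so a 20.5% relative gain is not the same as a 20.5-point absolute gain in F1.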
