Sibilant consonants classification comparison with multi‐ and single‐class neural networks

Author(s) - Anjos Ivo, Marques Nuno, Grilo Margarida, Guimarães Isabel, Magalhães João, Cavaco Sofia



Publication year - 2020

Publication title - Expert Systems

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.365

H-Index - 38

eISSN - 1468-0394

pISSN - 0266-4720

DOI - 10.1111/exsy.12620

Subject(s) - computer science, convolutional neural network, speech recognition, classifier (uml), binary classification, artificial intelligence, artificial neural network, deep learning, pattern recognition (psychology), support vector machine

Many children with speech sound disorders cannot pronounce the sibilant consonants correctly. We have developed a serious game, controlled by the children's voices in real time, to help children practice producing the European Portuguese (EP) sibilant consonants. For this, the game uses a sibilant consonant classifier. Since the game does not require any adult supervision, children can practice producing these sounds more often, which may lead to faster improvements in their speech. Recently, deep neural networks have considerably improved classification in a variety of use cases, from image classification to speech and language processing. Here, we propose to use deep convolutional neural networks to classify the sibilant phonemes of EP in our serious game for speech and language therapy. We compared the performance of several different artificial neural networks that used either Mel frequency cepstral coefficients or log Mel filterbanks as input features. Our best deep learning model, a 2D convolutional model with log Mel filterbanks as input features, achieves classification scores of 95.48%. These results are then further improved for specific classes with simple binary classifiers.
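To make the two input representations named in the abstract concrete, the following is a minimal, standard-library-only sketch of how log Mel filterbank energies and Mel frequency cepstral coefficients (MFCCs) are commonly computed for a single audio frame. It is not the authors' pipeline: the 16 kHz sample rate, the 256-sample frame, the 20 filters, the 13 coefficients, the Hamming window, and all function names are assumptions chosen for illustration, and a pure 4 kHz tone stands in for real speech.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame (naive DFT, fine for short frames)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):    # rising slope
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):   # falling slope, peak of 1.0 at centre
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def log_mel_energies(frame, bank, eps=1e-10):
    """Log of the filterbank-weighted power spectrum; eps avoids log(0)."""
    spec = power_spectrum(frame)
    return [math.log(sum(w * p for w, p in zip(row, spec)) + eps) for row in bank]


def mfcc_from_log_mel(log_mel, n_coeffs):
    """MFCCs as the (unnormalised) DCT-II of the log Mel energies."""
    m = len(log_mel)
    return [sum(e * math.cos(math.pi * c * (i + 0.5) / m)
                for i, e in enumerate(log_mel))
            for c in range(n_coeffs)]


# Illustrative input: one frame of a 4 kHz tone (assumed values, not real speech).
sample_rate, n_fft = 16000, 256
frame = [math.sin(2 * math.pi * 4000 * t / sample_rate) for t in range(n_fft)]

bank = mel_filterbank(20, n_fft, sample_rate)
log_mel = log_mel_energies(frame, bank)
mfcc = mfcc_from_log_mel(log_mel, 13)
peak = max(range(len(log_mel)), key=log_mel.__getitem__)
print(len(log_mel), len(mfcc), peak)
```

Stacking the log Mel vectors of consecutive frames gives a time-by-frequency matrix, which is the kind of image-like input a 2D convolutional model can consume. MFCCs apply a further cosine transform that decorrelates and compresses those same energies.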
