Phoneme-to-Articulatory Mapping Using Bidirectional Gated RNN
Author(s) -
Théo Biasutto-Lervat,
Slim Ouni
Publication year - 2018
Publication title -
Interspeech 2018
Language(s) - English
Resource type - Conference proceedings
DOI - 10.21437/interspeech.2018-1202
Subject(s) - coarticulation , computer science , speech recognition , recurrent neural network , speech production , artificial neural network , acoustic model , artificial intelligence , speech processing , vowel
Deriving articulatory dynamics from the acoustic speech signal has been addressed in several speech production studies. In this paper, we investigate whether articulatory dynamics can be predicted from phonetic information alone, without the acoustic speech signal. The input may be considered acoustically impoverished, since it likely carries no explicit coarticulation information, but we expect the phonetic sequence to provide compact yet rich knowledge. Motivated by the recent success of deep learning techniques in acoustic-to-articulatory inversion, we experimented with bidirectional gated recurrent neural network architectures. We trained these models on an EMA corpus and obtained good performance, comparable to state-of-the-art articulatory inversion from LSF features, while using only phoneme labels and durations.
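The architecture described in the abstract can be sketched as a bidirectional GRU that consumes a sequence of phoneme labels with their durations and emits articulator coordinates at each step. The following PyTorch sketch is illustrative only: the layer sizes, the embedding of phoneme labels, the duration-as-scalar feature, and the number of EMA channels are assumptions, not the paper's actual hyperparameters.

```python
import torch
import torch.nn as nn

class PhonemeToEMA(nn.Module):
    """Hypothetical sketch: bidirectional GRU mapping phoneme sequences
    to articulatory (EMA) trajectories. Sizes are illustrative."""

    def __init__(self, n_phonemes=40, emb_dim=32, hidden=128, n_ema=12):
        super().__init__()
        # Embed discrete phoneme labels; the duration is appended
        # as one extra scalar feature per time step.
        self.emb = nn.Embedding(n_phonemes, emb_dim)
        self.rnn = nn.GRU(emb_dim + 1, hidden,
                          batch_first=True, bidirectional=True)
        # 2 * hidden: forward and backward states are concatenated.
        self.out = nn.Linear(2 * hidden, n_ema)

    def forward(self, phonemes, durations):
        # phonemes: (batch, T) integer ids; durations: (batch, T) floats
        x = torch.cat([self.emb(phonemes), durations.unsqueeze(-1)], dim=-1)
        h, _ = self.rnn(x)          # (batch, T, 2 * hidden)
        return self.out(h)          # (batch, T, n_ema) EMA coordinates

model = PhonemeToEMA()
ph = torch.randint(0, 40, (2, 10))   # two sequences of 10 phonemes
dur = torch.rand(2, 10)              # per-phoneme durations
pred = model(ph, dur)
print(pred.shape)                    # torch.Size([2, 10, 12])
```

Bidirectionality matters here because coarticulation is influenced by both past and upcoming phonemes, so each output frame should see context in both directions.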
