Open Access
Applying DNN Adaptation to Reduce the Session Dependency of Ultrasound Tongue Imaging-based Silent Speech Interfaces
Author(s) -
Gábor Gosztolya,
Tamás Grósz,
László Tóth,
Alexandra Markó,
Tamás Gábor Csapó
Publication year - 2020
Publication title -
Acta Polytechnica Hungarica
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.277
H-Index - 34
eISSN - 2064-2687
pISSN - 1785-8860
DOI - 10.12700/aph.17.7.2020.7.6
Subject(s) - session (web analytics) , computer science , dependency (uml) , adaptation (eye) , speech recognition , tongue , artificial intelligence , psychology , medicine , neuroscience , world wide web , pathology
Silent Speech Interfaces (SSI) perform articulatory-to-acoustic mapping to convert articulatory movement into synthesized speech. Their main goal is to aid the speech handicapped, or to be used as part of a communication system operating in silence-required environments or in those with high background noise. Although many previous studies addressed the speaker-dependency of SSI models, session-dependency is also an important issue due to the possible misalignment of the recording equipment. In particular, there are currently no solutions available for tongue ultrasound recordings. In this study, we investigate the degree of session-dependency of standard feed-forward DNN-based models for ultrasound-based SSI systems. Besides examining the amount of training data required for speech synthesis parameter estimation, we also show that DNN adaptation can be useful for handling session dependency. Our results indicate that, by using adaptation, less training data and training time are needed to achieve the same speech quality than when training a new DNN from scratch. Our experiments also suggest that the sub-optimal cross-session behavior is caused by the misalignment of the recording equipment, as adapting just the lower, feature-extractor layers of the neural network proved sufficient to achieve a comparable level of performance.
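
The idea of adapting only the lower layers can be illustrated with a small, self-contained sketch. This is not the authors' model or data: the network size, the toy "session shift" (a fixed linear distortion of the inputs standing in for probe misalignment), and all names (`forward`, `adapt_lower_layer`, `session_shift`) are assumptions made purely for illustration. The sketch fine-tunes the first layer of a two-layer feed-forward network on data from a new "session" while keeping the upper layer frozen.

```python
import math
import random

def forward(x, W1, W2):
    """Tiny feed-forward net: tanh hidden layer, linear output layer."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = [sum(w * hi for w, hi in zip(row, h)) for row in W2]
    return h, y

def mse(data, W1, W2):
    """Mean (over samples) of the summed squared output error."""
    total = 0.0
    for x, t in data:
        _, y = forward(x, W1, W2)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return total / len(data)

def adapt_lower_layer(data, W1, W2, lr=0.05, epochs=200):
    """Fine-tune only W1 (lower layer) with SGD; W2 (upper layer) is frozen."""
    W1 = [row[:] for row in W1]  # copy, so the base model is left untouched
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x, W1, W2)
            err = [yi - ti for yi, ti in zip(y, t)]
            for j, row in enumerate(W1):
                # Backpropagate the output error through the frozen W2.
                dh = sum(err[k] * W2[k][j] for k in range(len(W2)))
                dz = dh * (1.0 - h[j] ** 2)  # derivative of tanh
                for i in range(len(row)):
                    row[i] -= lr * dz * x[i]
    return W1

def session_shift(x):
    """Hypothetical stand-in for equipment misalignment: a fixed linear distortion."""
    return [x[0] + 0.3 * x[1], x[1], x[2] - 0.2 * x[0]]

random.seed(0)
n_in, n_hidden, n_out = 3, 4, 2
# "Base session" model with random weights (assumed already trained).
W1_base = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
W2 = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]

# New-session data: same targets, but the inputs arrive distorted.
inputs = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(50)]
new_session = [(session_shift(x), forward(x, W1_base, W2)[1]) for x in inputs]

before = mse(new_session, W1_base, W2)
W1_adapted = adapt_lower_layer(new_session, W1_base, W2)
after = mse(new_session, W1_adapted, W2)
print(f"MSE on new session: before={before:.4f}, after={after:.4f}")
```

In this toy setting the distortion only affects the inputs, so retraining the first layer alone can compensate for it. This mirrors, in a very simplified way, the abstract's reasoning that the cross-session mismatch originates in the recording setup rather than in the higher-level mapping.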
