Open Access
Using multimodal speech production data to evaluate articulatory animation for audiovisual speech synthesis
Author(s) - Ingmar Steiner, Korin Richmond, Slim Ouni
Publication year - 2012
Publication title - HAL (Centre pour la Communication Scientifique Directe)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/2491599.2491601
Subject(s) - viseme , articulation , animation , morphing , computer facial animation , computer animation , speech synthesis , speech production , speech recognition , speech technology , computer graphics (images) , artificial intelligence , computer science
The importance of modeling speech articulation for high-quality audiovisual (AV) speech synthesis is widely acknowledged. Nevertheless, while state-of-the-art, data-driven approaches to facial animation can make use of sophisticated motion capture techniques, the animation of the intraoral articulators (viz. the tongue, jaw, and velum) typically relies on simple rules or viseme morphing, in stark contrast to the otherwise high quality of facial modeling. Using appropriate speech production data could significantly improve the quality of articulatory animation for AV synthesis.
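
The abstract contrasts data-driven facial capture with the rule-based viseme morphing commonly used for the tongue, jaw, and velum. For concreteness, here is a minimal sketch of that baseline: piecewise-linear interpolation between hand-authored viseme keyframes. The viseme labels, parameter vectors, and timings below are illustrative assumptions, not data or code from the paper.

import numpy as np

# Hypothetical viseme keyframes: each maps a viseme label to a small vector
# of articulator parameters (jaw opening, tongue tip height, velum closure).
# The inventory and values are illustrative, not taken from the paper.
VISEMES = {
    "sil": np.array([0.0, 0.0, 1.0]),
    "AA":  np.array([0.9, 0.2, 1.0]),   # open vowel: jaw lowered
    "N":   np.array([0.2, 0.9, 0.0]),   # nasal: tongue tip raised, velum open
}

def morph(track, t):
    """Linearly interpolate articulator parameters between the two
    viseme keyframes that bracket time t (in seconds)."""
    for (t0, v0), (t1, v1) in zip(track, track[1:]):
        if t0 <= t <= t1:
            a = (t - t0) / (t1 - t0)
            return (1.0 - a) * VISEMES[v0] + a * VISEMES[v1]
    return VISEMES[track[-1][1]]          # hold the last pose past the end

# Timed viseme track for a short utterance, e.g. "Anna" -> AA N AA
track = [(0.0, "sil"), (0.1, "AA"), (0.25, "N"), (0.4, "AA"), (0.55, "sil")]
for t in (0.05, 0.2, 0.3, 0.45):
    print(f"t={t:.2f}s  pose={morph(track, t)}")

The limitation the abstract targets is visible in this sketch: any transition between the same two visemes always yields the same trajectory, whereas measured speech production data captures coarticulatory variation that such rules cannot.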
