Open Access
Using multimodal speech production data to evaluate articulatory animation for audiovisual speech synthesis
Author(s) - Ingmar Steiner, Korin Richmond, Slim Ouni
Publication year - 2012
Publication title - HAL (Centre pour la Communication Scientifique Directe)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/2491599.2491601
Subject(s) - viseme , articulation , animation , morphing , computer facial animation , computer animation , speech synthesis , speech production , speech recognition , speech technology , computer graphics (images) , artificial intelligence , computer science
The importance of modeling speech articulation for high-quality audiovisual (AV) speech synthesis is widely acknowledged. Nevertheless, while state-of-the-art, data-driven approaches to facial animation can make use of sophisticated motion capture techniques, the animation of the intraoral articulators (viz. the tongue, jaw, and velum) typically relies on simple rules or viseme morphing, in stark contrast to the otherwise high quality of facial modeling. Using appropriate speech production data could significantly improve the quality of articulatory animation for AV synthesis.
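
The abstract contrasts data-driven facial capture with the rule-based viseme morphing commonly used for the tongue, jaw, and velum. For concreteness, here is a minimal sketch of that baseline: piecewise-linear interpolation between hand-authored viseme keyframes. The viseme labels, parameter vectors, and timings below are illustrative assumptions, not data or code from the paper.

import numpy as np

# Hypothetical viseme keyframes: each maps a viseme label to a small vector
# of articulator parameters (jaw opening, tongue tip height, velum closure).
# The inventory and values are illustrative, not taken from the paper.
VISEMES = {
    "sil": np.array([0.0, 0.0, 1.0]),
    "AA":  np.array([0.9, 0.2, 1.0]),   # open vowel: jaw lowered
    "N":   np.array([0.2, 0.9, 0.0]),   # nasal: tongue tip raised, velum open
}

def morph(track, t):
    """Linearly interpolate articulator parameters between the two
    viseme keyframes that bracket time t (in seconds)."""
    for (t0, v0), (t1, v1) in zip(track, track[1:]):
        if t0 <= t <= t1:
            a = (t - t0) / (t1 - t0)
            return (1.0 - a) * VISEMES[v0] + a * VISEMES[v1]
    return VISEMES[track[-1][1]]          # hold the last pose past the end

# Timed viseme track for a short utterance, e.g. "Anna" -> AA N AA
track = [(0.0, "sil"), (0.1, "AA"), (0.25, "N"), (0.4, "AA"), (0.55, "sil")]
for t in (0.05, 0.2, 0.3, 0.45):
    print(f"t={t:.2f}s  pose={morph(track, t)}")

The limitation the abstract targets is visible in this sketch: any transition between the same two visemes always yields the same trajectory, whereas measured speech production data captures coarticulatory variation that such rules cannot.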
