Lithuanian Broadcast Speech Transcription Using Semi-supervised Acoustic Model Training

Rasa Lileikytė, Arseniy Gorin, Lori Lamel, Jean-Luc Gauvain, Thiago Fraga-Silva

Open Access

Publication year - 2016

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2016.04.037

Subject(s) - computer science, pronunciation, discriminative model, speech recognition, word error rate, acoustic model, transcription (linguistics), artificial intelligence, natural language processing, speech processing, philosophy, linguistics

This paper reports on experimental work to build a speech transcription system for Lithuanian broadcast data, relying on unsupervised and semi-supervised training methods, as well as on other low-knowledge methods, to compensate for missing resources. Unsupervised acoustic model training is investigated using 360 hours of untranscribed speech data. A graphemic pronunciation approach is used to simplify pronunciation model generation and therefore ease language model adaptation for the system users. Discriminative training on top of semi-supervised training is also investigated, as well as various types of acoustic features and their combinations. Experimental results are provided for each of our development steps, along with contrastive results comparing various options. Using the best system configuration, a word error rate of 18.3% is obtained on a set of development data from the Quaero program.
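The word error rate quoted in the abstract is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the system output into the reference transcript, divided by the number of reference words. The sketch below illustrates that standard definition only; the function name `word_error_rate` and the example sentences are hypothetical, and the abstract does not describe the scoring tool or text normalization the authors actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Illustrative helper, not the authors' scoring tool. Words are obtained by
    whitespace splitting, with no normalization applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the word-level edit distance between the reference words
    # processed so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions are needed against an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one of four reference words is dropped.
    print(word_error_rate("laba diena visiems klausytojams", "laba diena visiems"))  # 0.25
```

Under this definition, a word error rate of 18.3% corresponds to roughly 18 word-level errors per 100 reference words.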
