Extraction of Speech Parameters from Speech Database using Festival

Open Access

Author(s) - N. Sangramsing, Monica R. Mundada

Publication year - 2016

Publication title - International Journal of Computer Applications

Language(s) - English

Resource type - Journals

ISSN - 0975-8887

DOI - 10.5120/ijca2016907882

Subject(s) - computer science , speech recognition , extraction (chemistry) , natural language processing , artificial intelligence , chemistry , chromatography

Speech synthesis is the artificial production of speech. A system that generates speech from text is called a text-to-speech (TTS) system. A TTS system takes text and voice models for one or more languages as input and generates speech that corresponds to the provided voice models. Speech synthesis systems can be extremely useful in helping people who are visually impaired or illiterate to join mainstream society. More recent applications include spoken dialogue systems and communicative robots. HMM (Hidden Markov Model) based speech synthesis is an emerging technology for TTS. An HMM-based speech synthesis system consists of a training phase and a synthesis phase. In the training phase, phone and excitation parameters are extracted from a speech database and modeled by context-dependent HMMs. In the synthesis phase, the system retrieves the suitable phone and excitation parameters from the previously trained models and generates the speech. The main objective of this project is to build an HMM-based speech synthesis system. The training process uses HTK (Hidden Markov Model Toolkit) and SPTK (Speech Signal Processing Toolkit), developed at Cambridge University and Tokyo Institute of Technology respectively. Synthesis is done with Festival, a language-independent speech synthesis tool developed at the University of Edinburgh. The main advantage of this approach is its flexibility in changing speaker identities, emotions and speaking styles.
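The two-phase structure described in the abstract can be illustrated with a deliberately simplified Python sketch. It is not the authors' implementation and does not call HTK, SPTK, or Festival. It assumes each training frame is a pair of a context label and a short parameter vector, and it replaces context-dependent HMMs with a plain per-label mean. The function names `train` and `synthesize`, the label strings, and the numeric values are all hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def train(database):
    """Training phase (toy): group parameter vectors by context label
    and keep one mean vector per label.

    The per-label mean is only a stand-in for a context-dependent HMM.
    """
    grouped = defaultdict(list)
    for label, params in database:
        grouped[label].append(params)
    # zip(*frames) iterates over parameter dimensions (columns).
    return {
        label: [fmean(column) for column in zip(*frames)]
        for label, frames in grouped.items()
    }


def synthesize(models, labels):
    """Synthesis phase (toy): look up the stored parameters for each
    context label in the requested sequence."""
    return [models[label] for label in labels]


# Hypothetical data: each vector is [spectral value, excitation value].
database = [
    ("sil-h+e", [1.0, 100.0]),
    ("sil-h+e", [2.0, 120.0]),
    ("h-e+l", [4.0, 130.0]),
]

models = train(database)
trajectory = synthesize(models, ["sil-h+e", "h-e+l"])
print(trajectory)  # [[1.5, 110.0], [4.0, 130.0]]
```

In a real system the lookup step would be followed by parameter generation and a vocoder that turns the parameter trajectory into a waveform; the sketch stops at the trajectory because the abstract gives no further detail.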