Open Access
Optimal state duration assignment in hidden Markov model‐based text‐to‐speech synthesis system
Author(s) - Khan Najeeb Ullah, Lee JungChul
Publication year - 2015
Publication title - Electronics Letters
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.375
H-Index - 146
eISSN - 1350-911X
pISSN - 0013-5194
DOI - 10.1049/el.2015.0539
Subject(s) - duration (music) , state (computer science) , constraint (computer aided design) , hidden markov model , sequence (biology) , speech recognition , computer science , speech synthesis , mathematics , algorithm , acoustics , physics , geometry , biology , genetics
In state‐of‐the‐art text‐to‐speech (TTS) systems, the state durations for each phoneme are generated so as to maximise the state sequence probability, subject to the constraint that the sum of all state durations equals the phoneme duration. This maximisation sometimes yields negative state durations when the specified phoneme duration is less than the sum of the mean durations of all the states of the phoneme. Because a negative duration cannot be realised in synthesis, such a discrepancy implicitly violates the equality constraint. This has implications for speech research problems in which each phoneme duration is specified, such as the use of a TTS synthesis system for singing voice synthesis research. An algorithm for state duration assignment is derived that maximises the probability of the state sequence under two constraints: the sum of the state durations must equal the total duration of the phoneme, and every state duration must be greater than or equal to 1. Experimental results show that the proposed algorithm always produces state durations greater than or equal to 1 while satisfying the equality constraint.
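
The abstract does not give the derivation, so the following Python sketch only illustrates the problem it describes and is not the authors' algorithm. It assumes each state duration is modelled by an independent Gaussian with a given mean and variance (the usual setup in HMM-based TTS), treats durations as continuous values, and uses a generic clamp-and-redistribute scheme to enforce the lower bound. The function names, `MIN_DURATION`, and the example numbers are all hypothetical.

```python
MIN_DURATION = 1.0  # lower bound on each state duration (assumed unit: frames)


def conventional_durations(means, variances, total):
    """Maximise the Gaussian state-duration likelihood subject only to
    sum(durations) == total. The result can contain negative values."""
    rho = (total - sum(means)) / sum(variances)
    return [m + rho * v for m, v in zip(means, variances)]


def constrained_durations(means, variances, total):
    """Same objective, with the extra constraint durations[k] >= MIN_DURATION.

    Illustrative scheme: states whose unconstrained duration falls below the
    bound are fixed at the bound, and the remaining duration is redistributed
    over the states that are still free, until no free state violates it.
    """
    n = len(means)
    if total < n * MIN_DURATION:
        raise ValueError("total duration is too short for the number of states")

    durations = [MIN_DURATION] * n
    free = set(range(n))          # states not yet fixed at the lower bound
    remaining = float(total)      # duration left for the free states

    while free:
        rho = (remaining - sum(means[k] for k in free)) / sum(
            variances[k] for k in free
        )
        candidate = {k: means[k] + rho * variances[k] for k in free}
        violating = [k for k in free if candidate[k] < MIN_DURATION]
        if not violating:
            for k in free:
                durations[k] = candidate[k]
            break
        for k in violating:
            free.remove(k)
            remaining -= MIN_DURATION
    return durations


if __name__ == "__main__":
    # Hypothetical 3-state phoneme, requested duration shorter than sum of means.
    means, variances, total = [2.0, 10.0, 2.0], [4.0, 1.0, 4.0], 5.0
    print(conventional_durations(means, variances, total))  # [-2.0, 9.0, -2.0]
    print(constrained_durations(means, variances, total))   # [1.0, 3.0, 1.0]
```

In this toy case the requested duration of 5 is well below the sum of the means (14), so the unconstrained solution pushes the two high-variance states to negative durations, while the constrained version keeps every state at 1 or above and still sums to 5. Rounding to integer frame counts, which a real synthesiser would need, is left out of the sketch.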
