Towards End-to-End Spoken Dialogue Systems with Turn Embeddings

Ali Orkan Bayer, Evgeny A. Stepanov, Giuseppe Riccardi

Open Access

Publication year - 2017

Publication title - Interspeech 2017

Language(s) - English

Resource type - Conference proceedings

DOI - 10.21437/interspeech.2017-1574

Subject(s) - end to end principle, computer science, turn taking, end user development, end user, artificial intelligence, world wide web, communication, conversation, psychology

Training task-oriented dialogue systems requires a significant amount of manual effort and the integration of many independently built components; moreover, the pipeline is prone to error propagation. End-to-end training has been proposed to overcome these problems by training the whole system over the utterances of both dialogue parties. In this paper, we present an end-to-end spoken dialogue system architecture based on turn embeddings. Turn embeddings encode a robust representation of user turns together with a local dialogue history. They are trained with sequence-to-sequence models that generate the previous and the next turns of the dialogue and, additionally, perform spoken language understanding. The end-to-end spoken dialogue system is then trained using the pre-trained turn embeddings in a stateful architecture that considers the whole dialogue history. We observe that the proposed architecture outperforms models based on local-only dialogue history and that it is robust to automatic speech recognition errors.
