Towards End-to-End Spoken Dialogue Systems with Turn Embeddings
Author(s) -
Ali Orkan Bayer,
Evgeny A. Stepanov,
Giuseppe Riccardi
Publication year - 2017
Publication title -
interspeech 2022
Language(s) - English
Resource type - Conference proceedings
DOI - 10.21437/interspeech.2017-1574
Subject(s) - end to end principle , computer science , turn taking , turn (biochemistry) , end user development , end user , artificial intelligence , world wide web , communication , conversation , psychology , biochemistry , chemistry
Training task-oriented dialogue systems requires significant amount of manual effort and integration of many independently built components; moreover, the pipeline is prone to errorpropagation. End-to-end training has been proposed to overcome these problems by training the whole system over the utterances of both dialogue parties. In this paper we present an end-to-end spoken dialogue system architecture that is based on turn embeddings. Turn embeddings encode a robust representation of user turns with a local dialogue history and they are trained using sequence-to-sequence models. Turn embeddings are trained by generating the previous and the next turns of the dialogue and additionally perform spoken language understanding. The end-to-end spoken dialogue system is trained using the pre-trained turn embeddings in a stateful architecture that considers the whole dialogue history. We observe that the proposed spoken dialogue system architecture outperforms the models based on local-only dialogue history and it is robust to automatic speech recognition errors.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom