
Lower resources of spoken language understanding from voice to semantics
Author(s) - Hao Zhang, LV Cheng Guo
Publication year - 2020
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1486/5/052033
Subject(s) - computer science, spoken language, pipeline (software), comprehension, natural language processing, semantics (computer science), speech recognition, domain (mathematical analysis), artificial intelligence, natural language, mathematical analysis, mathematics, programming language
Spoken language understanding (SLU) is traditionally designed as a pipeline of multiple components. First, an automatic speech recognition module maps the speech signal to text; then a natural language understanding module converts the recognized text into structured data such as domain, intent, and slot values. These modules are usually trained separately. End-to-end spoken language understanding, by contrast, derives the structured data directly from speech with a single model. However, end-to-end SLU requires a large amount of training data, which is difficult to obtain across different domains and different groups of speakers. For this reason, we introduce a low-resource end-to-end spoken language understanding model based on pre-training and combine it with capsule vectors. The experimental results show that this model's low-resource spoken language understanding remains robust across different data sets.
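The pipeline-versus-end-to-end contrast described in the abstract can be sketched as follows. This is an illustrative toy only: the function names, the example utterance, and the domain/intent/slot labels are assumptions for the sketch, not the paper's actual modules or data.

```python
# Illustrative sketch of the two SLU designs discussed in the abstract.
# All names and outputs here are hypothetical stand-ins, not the paper's models.

def asr(audio: bytes) -> str:
    """Pipeline stage 1 (stub): automatic speech recognition, audio -> text.
    A real system would run an acoustic model and a language model here."""
    return "play jazz in the kitchen"

def nlu(text: str) -> dict:
    """Pipeline stage 2 (stub): natural language understanding,
    text -> structured data (domain, intent, slot values).
    A real system would run domain/intent classifiers and a slot tagger."""
    return {
        "domain": "music",
        "intent": "play_music",
        "slots": {"genre": "jazz", "location": "kitchen"},
    }

def pipeline_slu(audio: bytes) -> dict:
    """Traditional SLU: two separately trained modules composed at inference.
    Errors from the ASR stage propagate into the NLU stage."""
    return nlu(asr(audio))

def end_to_end_slu(audio: bytes) -> dict:
    """End-to-end SLU: a single model maps speech directly to the same
    structured output, with no intermediate transcript (stubbed here)."""
    return {
        "domain": "music",
        "intent": "play_music",
        "slots": {"genre": "jazz", "location": "kitchen"},
    }
```

Both designs target the same output schema; the end-to-end variant simply removes the intermediate text representation, which is why it typically needs more paired speech-to-semantics training data, the scarcity the paper's pre-training approach aims to address.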