A Matlab Tool For Speech Processing, Analysis And Recognition: Sar Lab
Author(s) - Veton Këpuska, Mihir Patal, Nicholas Rogers
Publication year - 2020
Publication title - 2006 annual conference & exposition proceedings
Language(s) - English
Resource type - Conference proceedings
DOI - 10.18260/1-2--263
Subject(s) - computer science, speech recognition, speaker recognition, voice activity detection, audio mining, speech processing, matlab, signal processing, feature extraction, acoustic model, noise (video), key (lock), artificial intelligence, digital signal processing, computer hardware, computer security, image (mathematics), operating system
The presented work is related to research on developing a “smart-room.” A smart-room can sense all voice activity within the room and pinpoint the source of the audio signal (the speaker). The purpose of this audio sensing is twofold: to monitor for key words, sentences, or phrases that the monitoring entity has flagged as relevant, and to separate all acoustic sources from one another (e.g., background noise from the speaker's voice). A crucial requirement for successfully creating such a smart-room is speech recognition that is accurate (in terms of recognition performance), efficient (in CPU and memory use), and consistent (it must work equally well for all speakers, i.e., be speaker independent, and in all acoustic environments). To achieve this goal, it becomes necessary to develop tools that enable advanced research in speech processing, analysis, and recognition, specifically, in this case, wake-up-word (WUW) recognition. Developing such a system requires numerous tests of various system models. Modules for audio signal processing, feature extraction, voice activity detection, pattern classification, scoring, and so on must be combined in order to perform speech recognition. Thus, a major hurdle in this area of research is the analysis, testing, verification, and integration of the individual functions required for speech recognition. To address the analysis and testing issue, a software tool was developed in the MATLAB environment that provides a unified framework for tracking the performance of all the functions a WUW recognition system needs. This framework can also be used for testing algorithms and other software components that perform speech analysis and recognition tasks.
In addition to integrating all of the various components, the testing environment can produce additional analysis data, presented as graphs, charts, or images (e.g., spectrograms), that are useful when analyzing or troubleshooting the components under research. This testing environment has proven very useful in aiding research on the development of “wake-up-word” recognition technology. The tool has thus made the research process much more efficient, accurate, and productive.
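As a rough illustration of the unified-framework idea in the abstract, the Python sketch below chains a few stand-in stages (framing, a log-energy feature, and an energy-threshold voice activity detector) and records every intermediate output so that each stage can be inspected on its own. It is not the authors' MATLAB tool: the function names, frame sizes, threshold, and synthetic test signal are all illustrative assumptions.

```python
import math

def frame_signal(samples, frame_len=160, hop=80):
    """Split samples into overlapping frames; an incomplete tail is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def log_energy(frames):
    """Log energy per frame (a deliberately simple stand-in feature)."""
    return [math.log(sum(x * x for x in frame) + 1e-10) for frame in frames]

def energy_vad(energies, threshold=-5.0):
    """Mark a frame as active when its log energy exceeds the threshold.
    The threshold is an assumed value, not one taken from the paper."""
    return [e > threshold for e in energies]

def run_pipeline(samples, stages):
    """Run the stages in order and keep each stage's output for inspection."""
    trace = {}
    data = samples
    for name, stage in stages:
        data = stage(data)
        trace[name] = data
    return trace

# Synthetic input (assumed): 0.1 s of silence, then 0.1 s of a 440 Hz tone at 8 kHz.
fs = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 440 * n / fs) for n in range(800)]
signal = silence + tone

trace = run_pipeline(signal, [
    ("frames", frame_signal),
    ("log_energy", log_energy),
    ("vad", energy_vad),
])

# Every intermediate result stays available for plotting or troubleshooting.
print(len(trace["frames"]), "frames")
print(trace["vad"])
```

Because `run_pipeline` keeps a per-stage trace, a stage can be swapped for an alternative (for example, a different feature extractor) and its output compared against the previous run without touching the other stages.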