DECISION-TREE BASED ANALYSIS OF SPEAKING MODE DISCREPANCIES IN EMG-BASED SPEECH RECOGNITION
Author(s) - Michael Wand, Matthias Janke, Tanja Schultz
Publication year - 2012
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5220/0003787201010109
Subject(s) - speech recognition, decision tree, computer science, mode (computer interface), speaker recognition, natural language processing, tree (set theory), pattern recognition (psychology), artificial intelligence, mathematics, human–computer interaction, mathematical analysis
This study is concerned with the impact of speaking mode variabilities on speech recognition by surface electromyography (EMG). In EMG-based speech recognition, we capture the electric potentials of the human articulatory muscles by surface electrodes, so that the resulting signal can be used for speech processing. This enables the user to communicate silently, without uttering any sound. Previous studies have shown that the processing of silent speech creates a new challenge, namely that EMG signals of audible and silent speech are quite distinct. In this study we consider EMG signals of three speaking modes: audibly spoken speech, whispered speech, and silently mouthed speech. We present an approach to quantify the differences between these speaking modes by means of phonetic decision trees and show that this measure correlates highly with differences in the performance of a recognizer on the different speaking modes. We furthermore reinvestigate the spectral mapping algorithm, which reduces the discrepancy between different speaking modes, and give an evaluation of its effectiveness.
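The abstract's core idea, quantifying how distinct two speaking modes are from their EMG signals, can be illustrated with a toy sketch. This is not the paper's phonetic-decision-tree method (which operates on clustered phonetic contexts); it is a minimal stand-in, assuming synthetic one-dimensional "EMG features" and a single decision-stump split. The intuition carries over: if a simple tree split separates the modes well above chance, the modes are distinct, and a recognizer trained on one mode will likely degrade on the other.

```python
# Hypothetical sketch: measure speaking-mode discrepancy with a decision stump.
# The features and distributions below are synthetic assumptions, not the
# paper's actual EMG feature extraction.
import random

def stump_accuracy(a, b):
    """Fit a one-threshold decision stump separating samples a (mode A)
    from samples b (mode B) and return its training accuracy.
    Accuracy near 0.5 -> the modes overlap; near 1.0 -> they are distinct."""
    data = [(x, 0) for x in a] + [(x, 1) for x in b]
    data.sort()
    n = len(data)
    best = 0.5  # chance level if no split helps
    # Try every midpoint between consecutive sorted values as the threshold.
    for i in range(1, n):
        left, right = data[:i], data[i:]
        # Majority-vote label on each side of the split.
        correct = max(sum(1 for _, y in left if y == 0),
                      sum(1 for _, y in left if y == 1)) + \
                  max(sum(1 for _, y in right if y == 0),
                      sum(1 for _, y in right if y == 1))
        best = max(best, correct / n)
    return best

random.seed(0)
audible = [random.gauss(0.0, 1.0) for _ in range(200)]  # stand-in "audible" features
silent  = [random.gauss(2.0, 1.0) for _ in range(200)]  # shifted "silent" features

print(stump_accuracy(audible, silent))   # well above 0.5: modes are distinct
print(stump_accuracy(audible, audible))  # near 0.5: identical "modes" overlap
```

In the paper itself, the analogous measure comes from how audible, whispered, and silent data distribute across the leaves of phonetic decision trees, and the authors report that it correlates highly with recognizer performance differences across modes.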