Comparison of Speech Features on the Speech Recognition Task
Author(s) - Iosif Mporas, Todor Ganchev, Mihalis Siafarikas
Publication year - 2007
Publication title - Journal of Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.161
H-Index - 28
eISSN - 1552-6607
pISSN - 1549-3636
DOI - 10.3844/jcssp.2007.608.616
Subject(s) - computer science, speech recognition, task (project management), speech technology, voice activity detection, natural language processing, speech processing, artificial intelligence, management, economics
In the present work we review some recently proposed speech parameterization methods based on the discrete Fourier transform (DFT) and the discrete wavelet packet transform (DWPT), and evaluate their performance on the speech recognition task. Specifically, to assess the practical value of these less-studied methods, we evaluate them in a common experimental setup and compare their performance against traditional techniques, such as the Mel-frequency cepstral coefficients (MFCC) and the perceptual linear predictive (PLP) cepstral coefficients, which presently dominate the speech recognition field. In particular, using the well-established TIMIT speech corpus and the Sphinx-III speech recognizer, we present comparative results for 8 different speech parameterization techniques.
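For readers unfamiliar with the DFT-based baseline named in the abstract, the following is a minimal, dependency-free sketch of how MFCC features are commonly computed for a single frame: Hamming window, DFT power spectrum, triangular mel filterbank, logarithm, and DCT-II. It is not the configuration used in the paper. The function names, the 256-sample frame, the 16 kHz sampling rate, the 20 filters and the 12 coefficients are all illustrative assumptions, and the naive DFT is chosen for readability rather than speed.

```python
import cmath
import math


def hz_to_mel(hz):
    # A widely used mel-scale formula (assumed here, not taken from the paper).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of a Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (num_filters + 1))
                for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):       # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):       # falling slope, peak of 1.0 at mid
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, num_filters=20, num_ceps=12):
    """Cepstral coefficients c1..c_num_ceps for one frame (c0 is skipped)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Floor the energies so that silent frames do not produce log(0).
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                    for row in bank]
    # DCT-II of the log filterbank energies.
    return [sum(e * math.cos(math.pi * q * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for q in range(1, num_ceps + 1)]


# Usage: one 256-sample frame of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
frame = [math.sin(2.0 * math.pi * 440.0 * i / sample_rate) for i in range(256)]
coeffs = mfcc(frame, sample_rate)
print(len(coeffs))
```

A practical front end would add pre-emphasis, overlapping frames, an FFT, and usually delta features; those steps are left out here to keep the sketch short.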