Artificial Neural Network-Based Speech Recognition Using DWT Analysis Applied on Isolated Words from Oriental Languages
Author(s) -
Bacha Rehmam,
Zahid Halim,
Ghulam Abbas,
Muhammad Tufail
Publication year - 2015
Publication title - Malaysian Journal of Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.197
H-Index - 18
ISSN - 0127-9084
DOI - 10.22452/mjcs.vol28no3.5
Subject(s) - computer science, speech recognition, discrete wavelet transform, artificial neural network, haar wavelet, artificial intelligence, audio mining, pattern recognition (psychology), filter (signal processing), task (project management), feature (linguistics), speech processing, voice activity detection, wavelet transform, wavelet, linguistics, philosophy, management, computer vision, economics
Speech recognition is an emerging research area focused on human-computer interaction (HCI) and expert systems. Speech signals are often difficult to process because audio signals are non-stationary. This paper presents a speaker-independent speech recognition system, tested on isolated words from three oriental languages: Urdu, Persian, and Pashto. The proposed approach combines the discrete wavelet transform (DWT) with a feed-forward artificial neural network (FFANN): the DWT is used for feature extraction and the FFANN for classification. Isolated word recognition is accomplished by capturing speech signals, creating a code bank of speech samples, and then applying pre-processing techniques. To classify a wave sample, a four-layered FFANN model is used with resilient back-propagation (Rprop). The proposed system yields high accuracy for two and five classes. With the db-8 level-5 DWT filter, accuracy rates of 98.40%, 95.73%, and 95.20% are achieved with 10, 15, and 20 classes, respectively. The Haar level-5 DWT filter yields accuracy rates of 97.20%, 94.40%, and 91% for 10, 15, and 20 classes, respectively. The proposed system is also compared with a baseline method, which it outperforms. The proposed system can be used as a communication interface to computing and mobile devices in low-literacy regions.
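To make the feature-extraction step concrete, the following is a minimal sketch of a level-5 Haar DWT in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which statistics of the wavelet coefficients are fed to the FFANN, so the per-sub-band mean energy used here is an assumption, as are the function names (`haar_dwt_step`, `haar_wavedec`, `subband_energy_features`) and the synthetic sine wave standing in for a recorded word.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_wavedec(signal, level=5):
    """Multi-level decomposition, ordered [cA_level, cD_level, ..., cD_1]."""
    approx = list(signal)
    details = []
    for _ in range(level):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


def subband_energy_features(signal, level=5):
    """Assumed feature vector: mean energy of each sub-band (level + 1 values)."""
    bands = haar_wavedec(signal, level)
    return [sum(c * c for c in band) / len(band) for band in bands]


# Synthetic stand-in for one recorded word (assumption: 1024 samples at 8 kHz).
sample_rate = 8000
word = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1024)]
features = subband_energy_features(word, level=5)
print(len(features))
```

A level-5 decomposition produces one approximation band and five detail bands, so each word is reduced to a six-value vector that a classifier such as an FFANN could take as input. A db-8 filter would follow the same structure with longer filter taps; a wavelet library would normally be used for that case.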
