Visual Lip Reading using 3D-DCT and 3D-DWT and LSDA
Author(s) - Smrithi Sunil, Suprava Patnaik
Publication year - 2016
Publication title - International Journal of Computer Applications
Language(s) - English
Resource type - Journals
ISSN - 0975-8887
DOI - 10.5120/ijca2016908308
Subject(s) - computer science, reading (process), artificial intelligence, computer vision, linguistics, philosophy
Humans use visual information when trying to understand speech, especially in noisy conditions or when the audio signal is unavailable. Lip reading is the technique of understanding the underlying speech by processing the movement of the lips. However, recognizing lip motion is difficult because the region of interest (ROI) is nonlinear and noisy. The proposed lip reading system uses a two-stage feature extraction model that is precise, discriminative, and computationally efficient. The first stage applies a 3D Discrete Wavelet Transform (3D-DWT) or a 3D Discrete Cosine Transform (3D-DCT), and the second stage applies Locality Sensitive Discriminant Analysis (LSDA) to reduce the feature dimensions. Together, these stages yield a novel lip reading system with a small feature vector. In addition to the feature extraction technique, the performance of Naive Bayes and SVM classifiers is compared. The CUAVE database of English utterances of the digits 0 to 9 is used for experimentation. Results of the three-dimensional transforms with LSDA are compared with those of two-dimensional transforms with LSDA. The 3D-DWT+LSDA features are also compared with 3D-DWT combined with PCA or LDA, and with 3D-DCT+LSDA.
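As a rough illustration of the first stage only, the sketch below computes a separable 3D-DCT over a stack of mouth-region frames and keeps a low-frequency block of coefficients as the feature vector. It is not the authors' implementation. The function names (`dct_matrix`, `dct3`, `low_freq_features`), the ROI volume size, and the `keep` block size are all assumptions made for illustration, and the abstract does not say how coefficients are selected. The LSDA stage and the classifiers are not shown.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return m


def dct3(volume: np.ndarray) -> np.ndarray:
    """Separable 3D DCT-II: a 1D DCT applied along time, height, and width."""
    t, h, w = volume.shape
    out = np.einsum("at,thw->ahw", dct_matrix(t), volume)  # time axis
    out = np.einsum("bh,ahw->abw", dct_matrix(h), out)     # height axis
    out = np.einsum("cw,abw->abc", dct_matrix(w), out)     # width axis
    return out


def low_freq_features(volume, keep=(4, 8, 8)) -> np.ndarray:
    """Keep the low-frequency corner of the 3D-DCT as a flat feature vector.

    `keep` is a hypothetical choice, not a value taken from the paper.
    """
    coeffs = dct3(np.asarray(volume, dtype=float))
    kt, kh, kw = keep
    return coeffs[:kt, :kh, :kw].ravel()


# Hypothetical ROI volume: 16 grayscale frames of a 32 x 32 mouth region.
rng = np.random.default_rng(0)
roi_volume = rng.random((16, 32, 32))
features = low_freq_features(roi_volume)
```

In a pipeline like the one the abstract describes, a vector such as `features` would then be passed to a supervised dimensionality-reduction step (LSDA in the paper) before classification.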