Visual Speech Recognition
Author(s) -
Supriya A. Patil,
Vaibhav Dhoble,
Saatvik Gawade,
Pratiksha Jagdale,
Rohan Jinde
Publication year - 2022
Publication title -
International Journal of Advanced Research in Science, Communication and Technology
Language(s) - English
Resource type - Journals
ISSN - 2581-9429
DOI - 10.48175/ijarsct-2874
Subject(s) - computer science , speech recognition , artificial intelligence , computer vision , audio visual , facial recognition system , noise reduction , feature extraction , robustness , handset , multimedia
The audio-visual speech recognition approach improves noise robustness in mobile settings by extracting lip movement from side-face images. Although earlier bimodal speech recognition systems used frontal face (lip) images, those approaches are inconvenient because they require the user to talk while holding a device with a camera in front of their face. The proposed solution, which uses a small camera mounted in a handset to capture lip movement, is more natural, simple, and convenient. It also avoids degrading the signal-to-noise ratio (SNR) of the input speech. Optical-flow analysis extracts the visual features, which are then combined with audio features for CNN-based recognition.
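The abstract describes extracting motion features from lip-region frames and fusing them with audio features before recognition. As a rough illustration (not the authors' implementation), the sketch below uses NumPy only: per-cell frame differencing stands in for true optical-flow analysis, and the fusion step is simple feature-level concatenation; all array shapes and the `grid` parameter are illustrative assumptions.

```python
import numpy as np

def motion_features(frames, grid=4):
    """Crude motion features from a lip-region clip (stand-in for
    optical flow): mean absolute frame difference over a grid x grid
    mesh. `frames` has shape (T, H, W), grayscale."""
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0))  # (T-1, H, W)
    t_len, h, w = diffs.shape
    hs, ws = h // grid, w // grid
    feats = [
        [diffs[t, i * hs:(i + 1) * hs, j * ws:(j + 1) * ws].mean()
         for i in range(grid) for j in range(grid)]
        for t in range(t_len)
    ]
    return np.asarray(feats)  # (T-1, grid*grid)

def fuse(audio_feats, visual_feats):
    """Early (feature-level) fusion: trim to a common length and
    concatenate, giving one joint vector per time step that a CNN
    classifier could consume."""
    t = min(len(audio_feats), len(visual_feats))
    return np.concatenate([audio_feats[:t], visual_feats[:t]], axis=1)

# Toy demo with random arrays standing in for real lip frames and
# audio (e.g. MFCC) features.
rng = np.random.default_rng(0)
clip = rng.random((10, 32, 32))   # 10 hypothetical lip-region frames
audio = rng.random((12, 13))      # 12 audio frames of 13 coefficients
joint = fuse(audio, motion_features(clip))
print(joint.shape)  # (9, 29): 13 audio + 16 visual dims per step
```

In practice the visual front end would use a real optical-flow estimator and the joint features would feed a trained CNN; this sketch only shows the feature-fusion shape bookkeeping.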
