Open Access
From Vision to Voice: A Multi-Modal Assistive Framework for the Physically Impaired
Author(s) - Suhas Bhat, Prajwal Bhat, Sucheta Kolekar
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3590237
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Giving people with visual and physical impairments access to textual content remains a difficult challenge. This work presents a desktop assistive system that automatically converts text found in images into audible speech. The application is written in Python with a Tkinter interface, captures images in real time through OpenCV, and uses Tesseract OCR for optical character recognition. Through the googletrans library, the system supports multilingual operation (more than 100 languages), processing and translating text among the languages available through Google Translate. The extracted or translated text is converted to speech with Google Text-to-Speech (gTTS), saved as an .mp3 file, and played back through the system's default media player. The interface is designed for intuitive interaction, with hover effects, accessible controls, and a dropdown menu for language selection. The system's extensible design delivers multilingual text-to-speech capabilities that are useful in assistive technology applications.
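
The capture, OCR, translation, and speech stages named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: every function name, default file name, and the camera index are hypothetical, and the sketch assumes the opencv-python, pytesseract, Pillow, googletrans, and gTTS packages plus a local Tesseract install. Each stage is passed in as a callable so that it can be swapped out or stubbed.

```python
import asyncio
import inspect
import os
import subprocess
import sys


def capture_frame(path="capture.png", camera_index=0):
    """Grab one webcam frame with OpenCV and save it (hypothetical helper)."""
    import cv2  # third-party: opencv-python

    cap = cv2.VideoCapture(camera_index)
    try:
        ok, frame = cap.read()
    finally:
        cap.release()
    if not ok:
        raise RuntimeError("could not read a frame from the camera")
    cv2.imwrite(path, frame)
    return path


def ocr_image(image_path):
    """Extract text with Tesseract (needs the Tesseract binary installed)."""
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img).strip()


def translate_text(text, dest):
    """Translate with googletrans; newer releases return a coroutine."""
    from googletrans import Translator

    result = Translator().translate(text, dest=dest)
    if inspect.iscoroutine(result):
        result = asyncio.run(result)
    return result.text


def synthesize_speech(text, lang, out_path):
    """Write the text as spoken audio to an .mp3 file with gTTS."""
    from gtts import gTTS

    gTTS(text=text, lang=lang).save(out_path)


def play_file(path):
    """Open the file in the operating system's default player."""
    if sys.platform.startswith("win"):
        os.startfile(path)
    elif sys.platform == "darwin":
        subprocess.run(["open", path], check=False)
    else:
        subprocess.run(["xdg-open", path], check=False)


def image_to_speech(image_path, target_lang=None, out_path="speech.mp3",
                    ocr=ocr_image, translate=translate_text,
                    synthesize=synthesize_speech):
    """Run OCR, optional translation, then speech synthesis.

    Assumption: the same language code is valid for both googletrans and
    gTTS; a few codes differ between the two libraries in practice.
    """
    text = ocr(image_path)
    if not text:
        raise ValueError("no text recognised in the image")
    if target_lang:
        text = translate(text, target_lang)
    synthesize(text, target_lang or "en", out_path)
    return text
```

A typical call under these assumptions would be `image_to_speech(capture_frame(), target_lang="fr")` followed by `play_file("speech.mp3")`. Deferring the third-party imports into each helper keeps the module importable even when one of the optional dependencies is missing.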
