
VoiceTalk: A No-Code Approach for Creating Voice-Controlled Smart Home Applications
Author(s) - Yun-Wei Lin, Yi-Bing Lin, Yi-Feng Wu, Pei-Hsuan Shen
Publication year - 2025
Publication title - IEEE Open Journal of the Computer Society
Language(s) - English
Resource type - Magazines
eISSN - 2644-1268
DOI - 10.1109/ojcs.2025.3576725
Subject(s) - computing and processing
This article introduces VoiceTalk, a no-code approach for developing voice-controlled smart home applications without programming expertise. At its core, VoiceTalk uses IoTtalk, an IoT application development platform for managing a diverse range of IoT devices. IoTtalk employs a two-tier microservices architecture that lets users define and chain applications through an intuitive drag-and-drop line interface. Building on this microservice architecture, VoiceTalk integrates IoTtalk with Google Home, offering a no-code solution for voice-controlled applications. VoiceTalk uses its knowledge of the smart appliances in the room or house to generate specific prompts. We compare the translation accuracy of seven Automatic Speech Recognition (ASR) systems and make two contributions. First, the no-code VoiceTalk platform significantly simplifies the development of Google Home-like applications. Second, by integrating ASRs with a commercial LLM such as GPT, we dramatically reduce voice-to-text translation errors, for example, from 5.13% to 0.54% for the Web Speech API and from 2.25% to zero for Whisper Medium. For small open-source LLMs such as Llama 3.2 3B, the errors are reduced to 0.72% for the Web Speech API and to zero for Whisper Medium. Furthermore, the Device LLM Agent of VoiceTalk can be easily extended to integrate IoTtalk with other voice platforms, such as AWS Alexa.
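To put the reported error rates on a common scale, the following minimal sketch computes the relative reduction for each ASR and LLM pairing quoted in the abstract. The function and variable names are illustrative and do not come from the article. The sketch also assumes that the Llama 3.2 3B figures share the same baselines (5.13% and 2.25%) as the GPT figures, which the abstract implies but does not state outright.

```python
# Error rates in percent, as (before LLM correction, after LLM correction).
# Values are taken from the abstract; the Llama baselines are ASSUMED to be
# the same 5.13% / 2.25% baselines quoted for the GPT comparison.
REPORTED = {
    ("Web Speech API", "GPT"): (5.13, 0.54),
    ("Whisper Medium", "GPT"): (2.25, 0.0),
    ("Web Speech API", "Llama 3.2 3B"): (5.13, 0.72),
    ("Whisper Medium", "Llama 3.2 3B"): (2.25, 0.0),
}


def relative_reduction(before: float, after: float) -> float:
    """Return the fraction of the baseline error rate that was eliminated."""
    if before <= 0:
        raise ValueError("baseline error rate must be positive")
    return (before - after) / before


def summarize(reported=REPORTED) -> dict:
    """Map each (ASR, LLM) pair to its relative reduction, in percent."""
    return {
        pair: round(100 * relative_reduction(before, after), 1)
        for pair, (before, after) in reported.items()
    }


if __name__ == "__main__":
    for (asr, llm), pct in summarize().items():
        print(f"{asr} + {llm}: {pct}% relative error reduction")
```

Under these assumptions, the Web Speech API error rate drops by roughly 89.5% with GPT and roughly 86.0% with Llama 3.2 3B, while the Whisper Medium error rate is eliminated entirely in both cases.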