Leveraging synthetic data to develop a machine learning model for voiding flow rate prediction from audio signals

Marcos Lazaro Alvarez, Alfonso Bahillo, Laura Arjona, Diogo Marcelo Nogueira, Elsa Ferreira Gomes, Alipio M. Jorge

Open Access


Publication year - 2025

Publication title - IEEE Access

Language(s) - English

Resource type - Magazines

SCImago Journal Rank - 0.587

H-Index - 127

eISSN - 2169-3536

DOI - 10.1109/access.2025.3590626

Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation

Sound-based uroflowmetry (SU) is an emerging non-invasive alternative to traditional uroflowmetry (UF). It estimates the voiding flow rate from the sound of urine striking the water in a toilet, which enables remote monitoring and reduces patient burden and clinical costs. This study trains four machine learning (ML) models (random forest, gradient boosting, support vector machine, and convolutional neural network), using both regression and classification approaches, to predict and categorize the voiding flow rate from sound events. The models were trained on a dataset of sounds from synthetic void events generated with a high-precision peristaltic pump and a traditional toilet. Sound was recorded simultaneously with three devices: an Ultramic384k, a Mi A1 smartphone, and an Oppo Smartwatch. For audio feature extraction, segmenting the signals into 1000 ms segments and using frequencies up to 16 kHz provided the best results. Random forest achieved the best performance in both tasks, with a mean absolute error (MAE) of 0.9, 0.7, and 0.9 ml/s for regression and a quadratic weighted kappa (QWK) of 0.99, 1.0, and 1.0 for classification for the three devices. To evaluate the models in a real environment and assess the effectiveness of training with synthetic data, the best-performing models were retrained and validated using a dataset of real voiding sounds. The results showed an MAE below 2.5 ml/s for regression and a QWK above 0.86 for classification.
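The segmentation step and the two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, the 32 kHz sample rate is an assumption (the lowest rate that can represent content up to 16 kHz), and feature extraction and model training are not shown.

```python
from collections import Counter


def segment_signal(samples, sample_rate, segment_ms=1000):
    """Split a 1-D sample sequence into non-overlapping fixed-length segments.

    A trailing remainder shorter than one full segment is dropped.
    """
    seg_len = int(sample_rate * segment_ms / 1000)
    n_full = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_full)]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted flow rates."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Agreement between ordinal labels 0..n_classes-1.

    Disagreements are penalized by the squared distance between classes.
    Undefined (division by zero) if either label list uses a single class.
    """
    n = len(y_true)
    observed = Counter(zip(y_true, y_pred))  # confusion-matrix counts
    hist_true = Counter(y_true)
    hist_pred = Counter(y_pred)
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Expected count under independence of the two label sets.
            weighted_expected += weight * hist_true[i] * hist_pred[j] / n
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    sample_rate = 32_000  # assumed; not stated in the abstract
    recording = [0.0] * int(2.5 * sample_rate)  # placeholder 2.5 s signal
    segments = segment_signal(recording, sample_rate)
    print(len(segments), "segments of", len(segments[0]), "samples")

    # Illustrative values only, not results from the study.
    print(mean_absolute_error([10.0, 12.0, 8.0], [11.0, 11.0, 8.5]))
    print(quadratic_weighted_kappa([0, 1, 2, 2], [0, 1, 2, 1], n_classes=3))
```

A QWK of 1.0 means the predicted flow-rate categories match the true ones exactly, while a value near 0 means agreement no better than chance. Because the weights grow with the squared class distance, confusing adjacent categories costs far less than confusing distant ones.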
