Cross-Lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition

Zhengdong Yang, Qianying Liu, Sheng Li, Fei Cheng, Chenhui Chu

Open Access

Publication year - 2025

Publication title - IEEE Transactions on Audio, Speech and Language Processing

Language(s) - English

Resource type - Magazines

eISSN - 2998-4173

DOI - 10.1109/taslpro.2025.3617233

Subject(s) - signal processing and analysis; computing and processing; fields, waves and electromagnetics

We present a novel approach centered on the decoding stage of Automatic Speech Recognition (ASR) that improves multilingual performance, especially for low-resource languages. The approach uses cross-lingual embedding clustering to construct a hierarchical Softmax (H-Softmax) decoder, so that similar tokens across different languages share similar decoder representations. This addresses a limitation of the previous Huffman-based H-Softmax method, which relied on shallow features when assessing token similarity. Through experiments on a downsampled dataset of 15 languages, we demonstrate the effectiveness of our approach in improving low-resource multilingual ASR accuracy.
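To make the general idea concrete, the following is a minimal sketch of a cluster-based hierarchical Softmax, not the authors' implementation. It assumes a two-level factorization, P(token) = P(cluster) * P(token | cluster), with clusters obtained by plain k-means over token embeddings. The abstract does not specify the clustering algorithm, the tree depth, or how the embeddings are obtained, so the random embeddings, the weight matrices `w_cluster` and `w_token`, and all function names here are hypothetical stand-ins.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain k-means; returns a cluster id for every row of x."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating centers never modifies x.
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Assign every embedding to its nearest center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        # Move each center to the mean of its members (keep it if empty).
        for c in range(k):
            if (labels == c).any():
                centers[c] = x[labels == c].mean(0)
    return labels


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


def h_softmax(hidden, labels, w_cluster, w_token):
    """Two-level H-Softmax: P(token) = P(cluster) * P(token | cluster)."""
    p_cluster = softmax(w_cluster @ hidden)
    probs = np.zeros(len(labels))
    for c in range(len(p_cluster)):
        members = np.flatnonzero(labels == c)
        # Normalize only over the tokens that share this cluster.
        probs[members] = p_cluster[c] * softmax(w_token[members] @ hidden)
    return probs


rng = np.random.default_rng(0)
vocab_size, emb_dim, hid_dim = 12, 8, 16

# Assumption: random vectors stand in for learned cross-lingual token embeddings.
token_emb = rng.normal(size=(vocab_size, emb_dim))

labels = kmeans(token_emb, k=3)
# Renumber clusters as 0..n-1 so that every cluster id has at least one token.
_, labels = np.unique(labels, return_inverse=True)
n_clusters = int(labels.max()) + 1

# Assumption: random decoder weights and a random decoder hidden state.
w_cluster = rng.normal(size=(n_clusters, hid_dim))
w_token = rng.normal(size=(vocab_size, hid_dim))
hidden = rng.normal(size=hid_dim)

probs = h_softmax(hidden, labels, w_cluster, w_token)
print(probs.sum())
```

In this sketch, tokens whose embeddings fall in the same cluster share the same cluster-level probability term, which is one simple way similar tokens from different languages could end up with related decoder outputs. A deeper tree would apply the same factorization recursively.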
