Learning Word Embeddings: Unsupervised Methods for Fixed-size Representations of Variable-length Speech Segments
Author(s) -
Nils Holzenberger,
Mingxing Du,
Julien Karadayi,
Rachid Riad,
Emmanuel Dupoux
Publication year - 2018
Publication title -
Interspeech 2018
Language(s) - English
Resource type - Conference proceedings
DOI - 10.21437/interspeech.2018-2364
Subject(s) - upsampling, computer science, variable (mathematics), feature (linguistics), word (group theory), speech recognition, variety (cybernetics), artificial intelligence, ABX test, natural language processing, pattern recognition (psychology), mathematics, statistics, philosophy, mathematical analysis, linguistics, geometry, image (mathematics)
Fixed-length embeddings of words are very useful for a variety of tasks in speech and language processing. Here we systematically explore two methods of computing fixed-length embeddings for variable-length sequences. We evaluate their susceptibility to phonetic and speaker-specific variability on English, a high-resource language, and Xitsonga, a low-resource language, using two evaluation metrics: ABX word discrimination and ROC-AUC on same-different phoneme n-grams. We show that a simple downsampling method supplemented with length information can outperform the variable-length input feature representation on both evaluations. Recurrent autoencoders, trained without supervision, can yield even better results at the expense of increased computational complexity.
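To illustrate the downsampling idea from the abstract, the sketch below maps a variable-length feature sequence (e.g. T frames of acoustic features) to a fixed-size vector by interpolating each feature dimension at a fixed number of time points and appending the segment length. This is a hypothetical minimal sketch of the general technique, not the paper's exact procedure; the function name and the choice of 10 sample points are assumptions for illustration.

```python
import numpy as np

def downsample_embedding(features: np.ndarray, n_samples: int = 10) -> np.ndarray:
    """Fixed-length embedding of a (T, d) feature sequence.

    Uniformly downsamples each feature dimension to n_samples points
    via linear interpolation, flattens, and appends the original
    length T as length information. Illustrative sketch only.
    """
    T, d = features.shape
    # Evenly spaced query positions over the original time axis.
    positions = np.linspace(0, T - 1, n_samples)
    idx = np.arange(T)
    # Interpolate each of the d feature dimensions independently.
    sampled = np.stack(
        [np.interp(positions, idx, features[:, j]) for j in range(d)],
        axis=1,
    )
    # Flatten to a vector and append the segment length.
    return np.concatenate([sampled.ravel(), [float(T)]])

# Two segments of different lengths map to vectors of identical size.
short_emb = downsample_embedding(np.random.randn(23, 13))
long_emb = downsample_embedding(np.random.randn(180, 13))
```

Because the output dimension (n_samples * d + 1) is independent of T, such embeddings can be compared directly with cosine or Euclidean distance in evaluations like ABX word discrimination.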