Open Access
Learning Word Embeddings: Unsupervised Methods for Fixed-size Representations of Variable-length Speech Segments
Author(s) -
Nils Holzenberger,
Mingxing Du,
Julien Karadayi,
Rachid Riad,
Emmanuel Dupoux
Publication year - 2018
Publication title -
Interspeech 2018
Language(s) - English
Resource type - Conference proceedings
DOI - 10.21437/interspeech.2018-2364
Subject(s) - upsampling , computer science , variable (mathematics) , feature (linguistics) , word (group theory) , speech recognition , variety (cybernetics) , artificial intelligence , abx test , natural language processing , pattern recognition (psychology) , mathematics , statistics , philosophy , mathematical analysis , linguistics , geometry , image (mathematics)
Fixed-length embeddings of words are very useful for a variety of tasks in speech and language processing. Here we systematically explore two methods of computing fixed-length embeddings for variable-length sequences. We evaluate their susceptibility to phonetic and speaker-specific variability on English, a high-resource language, and Xitsonga, a low-resource language, using two evaluation metrics: ABX word discrimination and ROC-AUC on same-different phoneme n-grams. We show that a simple downsampling method supplemented with length information can outperform the variable-length input feature representation on both evaluations. Recurrent autoencoders, trained without supervision, can yield even better results at the expense of increased computational complexity.
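As a rough illustration of the downsampling approach the abstract describes, the sketch below collapses a variable-length frame sequence (e.g., MFCC features) into a fixed-size vector by interpolating a fixed number of frames and appending length information. The function name, the use of linear interpolation, the choice of ten frames, and appending the raw frame count are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def downsample_embedding(features: np.ndarray, n_frames: int = 10) -> np.ndarray:
    """Map a variable-length feature sequence (T x D) to a fixed-size
    vector: sample n_frames frames at evenly spaced positions, flatten,
    and append the segment length.

    Note: n_frames, linear interpolation, and the raw frame count as the
    length feature are illustrative choices, not the paper's setup.
    """
    T, D = features.shape
    # Evenly spaced (possibly fractional) positions along the time axis.
    positions = np.linspace(0, T - 1, n_frames)
    # Linear interpolation between the two neighbouring frames.
    lower = np.floor(positions).astype(int)
    upper = np.minimum(lower + 1, T - 1)
    frac = (positions - lower)[:, None]
    sampled = (1 - frac) * features[lower] + frac * features[upper]
    # Flatten to n_frames * D values and append length information.
    return np.concatenate([sampled.ravel(), [T]])

# Usage: a 47-frame, 13-dimensional segment becomes a 131-d vector.
segment = np.random.randn(47, 13)
emb = downsample_embedding(segment, n_frames=10)
assert emb.shape == (10 * 13 + 1,)
```

The recurrent autoencoder alternative mentioned in the abstract would instead encode the frame sequence with an RNN and use a fixed-size hidden representation as the embedding, trading this simplicity for higher computational cost.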
