Partitioning of Posteriorgrams Using Siamese Models for Unsupervised Acoustic Modelling
Author(s) - Arvid Fahlström Myrman, Giampiero Salvi
Publication year - 2017
Publication title - KTH Publication Database DiVA (KTH Royal Institute of Technology)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.21437/glu.2017-6
Subject(s) - computer science, artificial intelligence
Unsupervised methods tend to discover highly speaker-specific representations of speech. We propose a method for improving the quality of posteriorgrams generated from an unsupervised model through partitioning of the latent classes. We do this by training a sparse siamese model to find a linear transformation of the input posteriorgrams to lower-dimensional posteriorgrams. The siamese model makes use of same-category and different-category speech fragment pairs obtained by unsupervised term discovery. After training, the model is converted into an exact partitioning of the posteriorgrams. We evaluate the model on the minimal-pair ABX task in the context of the Zero Resource Speech Challenge. We demonstrate that our method significantly reduces the dimensionality of standard Gaussian mixture model posteriorgrams, while still making them more robust to speaker variations. This suggests that the model may be viable as a general post-processing step to improve probabilistic acoustic features obtained by unsupervised learning.
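
The partitioning step described in the abstract can be illustrated with a minimal sketch. It assumes the linear transformation is a non-negative matrix whose rows each sum to one, so that a posteriorgram frame maps to another valid probability vector, and that the "exact partitioning" assigns every input class to its single highest-weight output class. The row-wise softmax parameterization, the argmax rule, the random data, and all function and variable names are illustrative assumptions, not details taken from the paper, and the siamese training itself is not shown.

```python
import numpy as np


def soft_mapping(logits):
    """Row-wise softmax: each input class spreads its mass over the output classes."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum(axis=1, keepdims=True)


def harden(weights):
    """Turn a soft mapping into a partition: one output class per input class."""
    assignment = weights.argmax(axis=1)
    hard = np.zeros_like(weights)
    hard[np.arange(weights.shape[0]), assignment] = 1.0
    return hard, assignment


def transform(posteriorgram, mapping):
    """Apply the linear map to every frame (rows are frames)."""
    return posteriorgram @ mapping


# Hypothetical sizes and random stand-ins for real data and trained weights.
rng = np.random.default_rng(0)
n_frames, n_in, n_out = 5, 8, 3
posteriorgram = rng.dirichlet(np.ones(n_in), size=n_frames)  # each row sums to 1
logits = rng.normal(size=(n_in, n_out))  # placeholder for learned parameters

soft = soft_mapping(logits)
hard, assignment = harden(soft)
reduced = transform(posteriorgram, hard)  # shape (n_frames, n_out)
```

With the hardened mapping, each output dimension is simply the sum of the input posteriors for the classes assigned to it, so the reduced frames remain valid probability distributions.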
