Open Access

Partitioning of Posteriorgrams Using Siamese Models for Unsupervised Acoustic Modelling

Author(s) - Arvid Fahlström Myrman, Giampiero Salvi

Publication year - 2017

Publication title - KTH Publication Database DiVA (KTH Royal Institute of Technology)

Language(s) - English

Resource type - Conference proceedings

DOI - 10.21437/glu.2017-6

Subject(s) - computer science, artificial intelligence

Unsupervised methods tend to discover highly speaker-specific representations of speech. We propose a method for improving the quality of posteriorgrams generated from an unsupervised model through partitioning of the latent classes. We do this by training a sparse siamese model to find a linear transformation of the input posteriorgrams to lower-dimensional posteriorgrams. The siamese model makes use of same-category and different-category speech fragment pairs obtained by unsupervised term discovery. After training, the model is converted into an exact partitioning of the posteriorgrams. We evaluate the model on the minimal-pair ABX task in the context of the Zero Resource Speech Challenge. We are able to demonstrate that our method significantly reduces the dimensionality of standard Gaussian mixture model posteriorgrams, while still making them more robust to speaker variations. This suggests that the model may be viable as a general post-processing step to improve probabilistic acoustic features obtained by unsupervised learning.
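As a rough illustration of the idea of mapping posteriorgrams linearly to fewer classes and then converting that map into an exact partitioning, the following is a minimal sketch and not the authors' implementation. It assumes the linear map is a row-stochastic matrix obtained by a softmax over free parameters, and that the partition is obtained by assigning each input class to its highest-weighted output class. Both choices, and all names (`soft_map`, `harden`, `partition_map`, the sizes `T`, `D`, `K`), are hypothetical, and the random parameters stand in for trained ones. The siamese training itself is not shown.

```python
import numpy as np

def row_softmax(logits):
    # Numerically stable softmax over each row.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def soft_map(posteriors, logits):
    # posteriors: (T, D) frames, each row sums to 1.
    # logits: (D, K) free parameters (assumed parameterisation).
    # A row-stochastic (D, K) matrix keeps the output rows summing to 1.
    return posteriors @ row_softmax(logits)

def harden(logits):
    # Assumed conversion: assign each input class to the output class
    # with the largest weight, giving a one-hot (D, K) assignment matrix.
    assignment = logits.argmax(axis=1)
    return np.eye(logits.shape[1])[assignment]

def partition_map(posteriors, logits):
    # Each output posterior is the sum of the input posteriors
    # of the classes assigned to that partition.
    return posteriors @ harden(logits)

rng = np.random.default_rng(0)
T, D, K = 5, 8, 3                                # illustrative sizes only
posteriors = rng.dirichlet(np.ones(D), size=T)   # stand-in GMM posteriorgram
logits = rng.normal(size=(D, K))                 # stand-in for trained parameters

soft = soft_map(posteriors, logits)
hard = partition_map(posteriors, logits)
```

Under these assumptions both `soft` and `hard` remain valid posteriorgrams (non-negative rows summing to one), and `hard` is exactly a merging of the original classes into `K` groups.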
