Open Access

Sparse Coding of Pitch Contours with Deep Auto-Encoders

Author(s) - Nicolas Obin, Julie Belião

Publication year - 2018

Publication title - Speech Prosody

Language(s) - English

Resource type - Conference proceedings

SCImago Journal Rank - 0.274

H-Index - 18

ISSN - 2333-2042

DOI - 10.21437/speechprosody.2018-161

Subject(s) - computer science, cluster analysis, artificial intelligence, deep learning, pattern recognition (psychology), autoencoder, encoder, feature learning, encoding (memory), speech recognition, algorithm, operating system

This paper presents a sparse coding algorithm based on deep auto-encoders for the stylization and clustering of pitch contours. The main objective of the algorithm is to learn a set of pitch templates that humans can easily interpret and whose combinations efficiently approximate the observed pitch contours. The learning architecture is based on deep auto-encoders, which are commonly used to learn non-linear, low-dimensional latent representations that approximate the observed data. Specifically, the architecture stacks auto-encoders, and the sparsity of the network is investigated through dropout, denoising auto-encoders, and sparsity regularization in order to learn a more robust and general representation of the pitch contours. The deep auto-encoding of pitch contours is illustrated and discussed on the TIMIT American-English speech database and compared with other existing stylization and clustering algorithms.
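
To make the idea in the abstract concrete, the following is a minimal sketch of a sparse denoising auto-encoder in NumPy. It is not the authors' implementation: it uses a single hidden layer where the paper stacks several, it omits dropout, and it trains on synthetic rise, fall, and rise-fall shapes instead of TIMIT pitch contours. The function names (`make_contours`, `train_sparse_dae`, `encode`) and all hyper-parameters are assumptions chosen for illustration.

```python
import numpy as np

def make_contours(n, length=20, seed=0):
    """Synthetic stand-ins for pitch contours (assumption: not TIMIT data)."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, length)
    shapes = np.stack([t, -t, np.sin(np.pi * t)])        # rise, fall, rise-fall
    shapes = shapes - shapes.mean(axis=1, keepdims=True)  # remove each shape's mean
    labels = rng.integers(0, 3, size=n)
    scale = rng.uniform(0.5, 1.5, size=(n, 1))
    X = scale * shapes[labels] + 0.05 * rng.normal(size=(n, length))
    return X, labels

def train_sparse_dae(X, n_hidden=8, noise_std=0.1, l1=1e-3,
                     lr=0.05, epochs=1000, seed=0):
    """Full-batch gradient descent on: reconstruction error + L1 penalty on the code."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        # Denoising: corrupt the input, but reconstruct the clean contour.
        X_noisy = X + noise_std * rng.normal(size=X.shape)
        Z = X_noisy @ W1 + b1
        H = np.maximum(Z, 0.0)            # non-negative code (ReLU)
        X_hat = H @ W2 + b2               # linear decoder
        err = X_hat - X
        loss = np.sum(err ** 2) / n + l1 * np.sum(np.abs(H)) / n
        losses.append(loss)
        # Backpropagation, written out by hand.
        dX_hat = 2.0 * err / n
        dW2 = H.T @ dX_hat
        db2 = dX_hat.sum(axis=0)
        dH = dX_hat @ W2.T + l1 * np.sign(H) / n
        dZ = dH * (Z > 0)
        dW1 = X_noisy.T @ dZ
        db1 = dZ.sum(axis=0)
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    params = {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
    return params, np.array(losses)

def encode(X, params):
    """Map contours to their hidden codes (no input corruption at test time)."""
    return np.maximum(X @ params["W1"] + params["b1"], 0.0)

X, labels = make_contours(300)
params, losses = train_sparse_dae(X)
codes = encode(X, params)
templates = params["W2"]   # one row per hidden unit, same length as a contour
print("first loss:", losses[0], "last loss:", losses[-1])
print("fraction of zero code entries:", np.mean(codes == 0.0))
```

Because the decoder is linear, each reconstructed contour is a weighted sum of the rows of `W2` plus a bias, so those rows can be plotted and read as learned templates in the spirit of the abstract. The codes could then be passed to any clustering routine. How closely this matches the templates and clustering procedure used in the paper is not something the abstract specifies.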
