Sparse Coding of Pitch Contours with Deep Auto-Encoders
Author(s) - Nicolas Obin, Julie Belião
Publication year - 2018
Publication title - Speech Prosody
Language(s) - English
Resource type - Conference proceedings
SCImago Journal Rank - 0.274
H-Index - 18
ISSN - 2333-2042
DOI - 10.21437/speechprosody.2018-161
Subject(s) - computer science, cluster analysis, artificial intelligence, deep learning, pattern recognition (psychology), autoencoder, encoder, feature learning, encoding (memory), speech recognition, algorithm, operating system
This paper presents a sparse coding algorithm based on deep auto-encoders for the stylization and clustering of pitch contours. The main objective of the proposed algorithm is to learn a set of pitch templates that humans can easily interpret and whose combination efficiently approximates the observed pitch contours. The learning architecture relies on deep auto-encoders, which are commonly used to learn non-linear, low-dimensional latent representations that approximate the observed data. Specifically, it is built from stacked auto-encoders, and the sparsity of the network is investigated (dropout, denoising auto-encoder, sparsity regularization) in order to learn a more robust and general representation of the pitch contours. The deep auto-encoding of pitch contours is illustrated and discussed on the TIMIT American-English speech database, in comparison with other existing stylization and clustering algorithms.
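
To make the idea of sparse auto-encoding of pitch contours more concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' architecture: it assumes a single hidden layer instead of stacked auto-encoders, combines only two of the sparsity mechanisms named in the abstract (denoising and an L1 penalty on the hidden code, no dropout), and trains on synthetic rise, fall, and rise-fall contours instead of TIMIT. The function names, layer sizes, and hyperparameters are all hypothetical choices made for illustration.

```python
import numpy as np


def make_contours(rng, n=300, length=20):
    """Synthetic stand-in for pitch contours (assumption: not TIMIT data)."""
    t = np.linspace(0.0, 1.0, length)
    # Three hypothetical templates: rise, fall, rise-fall.
    templates = np.stack([t - 0.5, 0.5 - t, 0.5 - 2.0 * np.abs(t - 0.5)])
    labels = rng.integers(0, 3, size=n)
    scale = rng.uniform(0.5, 1.5, size=(n, 1))
    return scale * templates[labels] + 0.05 * rng.standard_normal((n, length))


def train_sparse_dae(X, n_hidden=8, noise_std=0.1, l1=1e-3,
                     lr=0.02, epochs=1000, seed=0):
    """Single-layer denoising auto-encoder with an L1 penalty on the code."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = 0.1 * rng.standard_normal((d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = 0.1 * rng.standard_normal((n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        # Denoising: corrupt the input, reconstruct the clean contour.
        X_noisy = X + noise_std * rng.standard_normal(X.shape)
        pre = X_noisy @ W1 + b1
        H = np.maximum(pre, 0.0)          # ReLU hidden code
        X_hat = H @ W2 + b2               # linear decoder
        # Loss: mean over contours of squared error + l1 * |code|.
        d_out = 2.0 * (X_hat - X) / n
        dW2 = H.T @ d_out
        db2 = d_out.sum(axis=0)
        dH = d_out @ W2.T + l1 * np.sign(H) / n
        d_pre = dH * (pre > 0)
        dW1 = X_noisy.T @ d_pre
        db1 = d_pre.sum(axis=0)
        # Plain full-batch gradient descent step.
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return W1, b1, W2, b2


def reconstruction_error(X, params):
    """Mean squared reconstruction error per contour on clean input."""
    W1, b1, W2, b2 = params
    H = np.maximum(X @ W1 + b1, 0.0)
    return float(np.mean(np.sum((H @ W2 + b2 - X) ** 2, axis=1)))


if __name__ == "__main__":
    X = make_contours(np.random.default_rng(0))
    untrained = train_sparse_dae(X, epochs=0)   # same seed, same init
    trained = train_sparse_dae(X)
    print("error before training:", reconstruction_error(X, untrained))
    print("error after training: ", reconstruction_error(X, trained))
```

In this sketch each row of the decoder matrix `W2` has the length of a contour, so it can be plotted and read as a candidate pitch template, and the hidden code of a contour indicates which templates are combined to approximate it. Whether such templates turn out interpretable on real speech data depends on choices the abstract does not specify.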
