Open Access
Recognizing Emotions From Whispered Speech Based on Acoustic Feature Transfer Learning
Author(s) -
Jun Deng,
Sascha Frühholz,
Zixing Zhang,
Björn Schuller
Publication year - 2017
Publication title -
IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/ACCESS.2017.2672722
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Whispered speech, as an alternative speaking style to normal phonated (non-whispered) speech, has received little attention in speech emotion recognition. Current speech emotion recognition systems are designed almost exclusively for normal phonated speech, and their performance can degrade significantly on whispered speech because the two speaking styles differ fundamentally in vocal excitation and vocal tract function. Motivated by the recent successes of feature transfer learning, this study sheds some light on this topic by proposing three feature transfer learning methods based on denoising autoencoders, shared-hidden-layer autoencoders, and extreme learning machine autoencoders. Even without labeled whispered speech data in the training phase, the three proposed methods enable modern emotion recognition models trained on normal phonated speech to reliably handle whispered speech as well. Through extensive experiments on the Geneva Whispered Emotion Corpus and the Berlin Emotional Speech Database, we compare our methods with alternatives reported to perform well across a wide range of speech emotion recognition tasks, and find that the proposed methods yield significantly superior performance on both normal phonated and whispered speech.
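To give a concrete sense of the first of the three transfer methods, the following is a minimal sketch of a denoising autoencoder in NumPy: the input features are corrupted with noise, and the network learns to reconstruct the clean input, yielding a hidden representation that is more robust to the mismatch between speaking styles. This is an illustrative toy only, not the authors' implementation — the layer size, noise level, learning rate, and the use of tied weights are all assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class DenoisingAutoencoder:
    """Toy single-hidden-layer denoising autoencoder with tied weights."""

    def __init__(self, n_in, n_hidden, noise_std=0.1, lr=0.5):
        self.W = rng.normal(0, 0.1, (n_in, n_hidden))  # tied encoder/decoder weights
        self.b_h = np.zeros(n_hidden)                  # hidden bias
        self.b_o = np.zeros(n_in)                      # output bias
        self.noise_std = noise_std
        self.lr = lr

    def encode(self, x):
        # The hidden activation serves as the transferred feature vector.
        return sigmoid(x @ self.W + self.b_h)

    def decode(self, h):
        return sigmoid(h @ self.W.T + self.b_o)

    def train_step(self, x):
        # Corrupt the input, then reconstruct the *clean* input.
        x_noisy = x + rng.normal(0, self.noise_std, x.shape)
        h = self.encode(x_noisy)
        y = self.decode(h)
        err = y - x                          # gradient of 0.5 * squared error
        # Backpropagate through the sigmoid output and hidden layers.
        d_o = err * y * (1 - y)
        d_h = (d_o @ self.W) * h * (1 - h)
        grad_W = x_noisy.T @ d_h + d_o.T @ h  # tied weights: sum both paths
        self.W -= self.lr * grad_W / len(x)
        self.b_h -= self.lr * d_h.mean(axis=0)
        self.b_o -= self.lr * d_o.mean(axis=0)
        return float(np.mean(err ** 2))

# Stand-in "acoustic features": 200 samples, 8 dimensions in [0, 1].
X = rng.random((200, 8))
dae = DenoisingAutoencoder(n_in=8, n_hidden=4)
losses = [dae.train_step(X) for _ in range(500)]
features = dae.encode(X)  # robust features fed to the emotion classifier
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}, features: {features.shape}")
```

In a transfer setting along the lines the abstract describes, such an autoencoder would be trained on (unlabeled) speech so that both normal phonated and whispered utterances are mapped into a shared, noise-robust feature space before the emotion classifier is applied.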

