Deep Learning Approaches for Voice Emotion Recognition Using Sentiment-Arousal Space
Author(s) - Narek T. Tumanyan
Publication year - 2021
Publication title - Mathematical Problems of Computer Science
Language(s) - English
Resource type - Journals
eISSN - 2738-2788
pISSN - 2579-2784
DOI - 10.51408/1963-0077
Subject(s) - computer science , speech recognition , representation (politics) , convolutional neural network , artificial intelligence , feature learning , mel frequency cepstrum , signal (programming language) , arousal , task (project management) , emotion recognition , pattern recognition (psychology) , feature extraction , psychology , management , neuroscience , politics , political science , law , economics , programming language
In this paper, we present deep learning-based approaches to emotion recognition in voice recordings. A key component of the methods is the representation of emotion categories in a sentiment-arousal space and the use of this representation in the supervision signal. Our methods use wavelet and cepstral features as efficient representations of audio signals. Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) network architectures were used for recognition, depending on whether the audio representation was treated as a spatial signal or as a temporal signal. Various recognition approaches were used, and the results were analyzed.
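
As an illustration of how a sentiment-arousal space can serve as a supervision signal, the following minimal sketch maps each emotion category to a point in a two-dimensional (sentiment, arousal) plane, scores a model's predicted point against that target with a squared-error loss, and decodes a prediction back to the nearest category. The coordinates, the label set, the loss, and the function names (`to_target`, `mse_loss`, `decode`) are all assumptions made for this example; the abstract does not specify any of them.

```python
import math

# Hypothetical (sentiment, arousal) coordinates in [-1, 1].
# Illustrative values only, not the ones used in the paper.
EMOTION_COORDS = {
    "happy":   (0.8, 0.6),
    "angry":   (-0.7, 0.8),
    "sad":     (-0.7, -0.6),
    "calm":    (0.6, -0.7),
    "neutral": (0.0, 0.0),
}


def to_target(label):
    """Return the regression target (sentiment, arousal) for a category label."""
    return EMOTION_COORDS[label]


def mse_loss(pred, target):
    """Mean squared error between a predicted point and a target point."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def decode(pred):
    """Map a predicted point to the category whose coordinates are nearest."""
    return min(EMOTION_COORDS, key=lambda name: math.dist(pred, EMOTION_COORDS[name]))


if __name__ == "__main__":
    # Stand-in for the output of a CNN or LSTM regression head.
    prediction = (0.7, 0.5)
    print(decode(prediction), mse_loss(prediction, to_target("happy")))
```

One possible benefit of such a target, under these assumptions, is that errors between nearby emotions (for example, a "happy" clip predicted near "calm") are penalized less than errors between distant ones, which a plain one-hot classification target would not capture.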
