Open Access
Interaffection of Multiple Datasets with Neural Networks in Speech Emotion Recognition
Author(s) -
Ronnypetson Da Silva,
Valter M. Filho,
Mário José de Souza
Publication year - 2020
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5753/eniac.2020.12141
Subject(s) - computer science, deep neural networks, artificial neural network, emotion recognition, artificial intelligence, machine learning, pattern recognition, speech recognition
Many works that apply Deep Neural Networks (DNNs) to Speech Emotion Recognition (SER) use a single dataset, or train and evaluate models separately when multiple datasets are available. Because each dataset is constructed under its own guidelines, and because SER labels are inherently subjective, obtaining robust and general models is difficult. We investigate how DNNs learn shared representations across different datasets in both multi-task and unified setups. We also analyse how each dataset benefits from the others under different combinations of datasets and popular neural network architectures. We show that the longstanding belief that more data yields more general models does not always hold for SER: a different combination of datasets and meta-parameters yields the best result for each of the analysed datasets.
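The multi-task setup described in the abstract can be illustrated with a shared encoder feeding dataset-specific classification heads. The sketch below is a minimal, forward-pass-only illustration of that idea, not the authors' implementation; the feature size, hidden size, dataset names, and label counts are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 40 acoustic features (e.g. MFCCs), one shared
# hidden layer, and per-dataset emotion label sets of different sizes.
N_FEATS, N_HIDDEN = 40, 64
N_CLASSES = {"dataset_a": 4, "dataset_b": 6}  # illustrative label counts

# Shared encoder parameters, trained jointly on all datasets.
W_shared = rng.standard_normal((N_FEATS, N_HIDDEN)) * 0.1
b_shared = np.zeros(N_HIDDEN)

# One output head per dataset (the multi-task setup). A unified setup
# would instead map every dataset's labels into a single shared head.
heads = {
    name: (rng.standard_normal((N_HIDDEN, k)) * 0.1, np.zeros(k))
    for name, k in N_CLASSES.items()
}

def forward(x, dataset):
    """Shared representation followed by the dataset-specific head."""
    h = np.maximum(0.0, x @ W_shared + b_shared)      # ReLU encoder
    W_head, b_head = heads[dataset]
    logits = h @ W_head + b_head
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)          # softmax probabilities

batch = rng.standard_normal((8, N_FEATS))
probs_a = forward(batch, "dataset_a")   # shape (8, 4)
probs_b = forward(batch, "dataset_b")   # shape (8, 6)
```

During training, batches from each dataset would update the shared encoder and only that dataset's head, which is how the shared representation comes to reflect all of the datasets at once.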
