Speech emotion classification using combined neurogram and INTERSPEECH 2010 paralinguistic challenge features
Author(s) - Jassim Wissam A., Paramesran Raveendran, Harte Naomi
Publication year - 2017
Publication title - IET Signal Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.384
H-Index - 42
ISSN - 1751-9683
DOI - 10.1049/iet-spr.2016.0336
Subject(s) - computer science , paralanguage , speech recognition , utterance , pattern recognition (psychology) , artificial intelligence , feature (linguistics) , signal (programming language) , emotion classification , support vector machine , communication , linguistics , philosophy , sociology , programming language
Recently, increasing attention has been directed toward identifying the emotional content of spoken utterances. This study introduces a method to improve emotion classification performance in clean and noisy environments by combining two types of features: the proposed neural-response-based features and the traditional INTERSPEECH 2010 paralinguistic emotion challenge features. The neural-response-based features are derived from a computational model of the auditory system for listeners with normal hearing. The model simulates the response of an auditory-nerve fibre at a given characteristic frequency to a speech signal, and the simulated responses across fibres are assembled into a 2D neurogram (a time-frequency representation). The neurogram image is sub-divided into non-overlapping blocks, and the average value of each block is computed. The neurogram features and the traditional emotion features are concatenated to form the feature vector for each speech signal, and support vector machines are trained on these vectors to predict the emotion of the speech. The performance of the proposed method is evaluated on two well-known databases: the eNTERFACE and Berlin emotional speech data sets. The results show that the proposed method outperforms classification using the neurogram and INTERSPEECH features separately.
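
To make the pipeline in the abstract concrete, the Python sketch below block-averages a neurogram into a fixed-length vector, concatenates it with a paralinguistic feature vector, and trains an SVM. This is not the authors' code: in the paper the neurogram comes from an auditory-nerve model and the 1582-dimensional INTERSPEECH 2010 feature set is typically extracted with a toolkit such as openSMILE; here random arrays stand in for both, and the 8x8 block grid and the 64-channel by 500-frame neurogram size are illustrative assumptions.

# Minimal sketch of the described feature pipeline (not the authors' code).
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def neurogram_block_features(neurogram, n_freq_blocks=8, n_time_blocks=8):
    """Sub-divide a 2D neurogram into non-overlapping blocks and
    return the mean of each block as a flat feature vector."""
    f_edges = np.linspace(0, neurogram.shape[0], n_freq_blocks + 1, dtype=int)
    t_edges = np.linspace(0, neurogram.shape[1], n_time_blocks + 1, dtype=int)
    feats = [neurogram[f_edges[i]:f_edges[i + 1], t_edges[j]:t_edges[j + 1]].mean()
             for i in range(n_freq_blocks) for j in range(n_time_blocks)]
    return np.asarray(feats)

rng = np.random.default_rng(0)
n_utterances, n_is10 = 200, 1582          # IS10 challenge set has 1582 features
y = rng.integers(0, 6, n_utterances)      # six emotion classes, as in eNTERFACE
X = []
for _ in range(n_utterances):
    neurogram = rng.random((64, 500))     # placeholder: 64 CF channels x 500 frames
    is10 = rng.random(n_is10)             # placeholder paralinguistic vector
    X.append(np.concatenate([neurogram_block_features(neurogram), is10]))
X = np.vstack(X)

# Combined features train a single SVM, mirroring the fusion in the paper.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)
print("train accuracy:", clf.score(X, y))

The block grid controls the dimensionality of the neurogram features (8x8 gives 64 values per utterance regardless of utterance length), which is what allows them to be concatenated with the fixed-length paralinguistic vector.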
