Speech emotion classification using combined neurogram and INTERSPEECH 2010 paralinguistic challenge features
Author(s) - Jassim Wissam A., Paramesran Raveendran, Harte Naomi
Publication year - 2017
Publication title - IET Signal Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.384
H-Index - 42
ISSN - 1751-9683
DOI - 10.1049/iet-spr.2016.0336
Subject(s) - computer science , paralanguage , speech recognition , utterance , pattern recognition (psychology) , artificial intelligence , feature (linguistics) , signal (programming language) , emotion classification , support vector machine , communication , linguistics , philosophy , sociology , programming language
Recently, increasing attention has been directed toward identifying the emotional content of spoken utterances. This study introduces a method to improve emotion classification performance in clean and noisy environments by combining two types of features: the proposed neural-response-based features and the traditional INTERSPEECH 2010 paralinguistic emotion challenge features. The neural-response-based features are derived from a computational model of the auditory system for listeners with normal hearing. The model simulates the response of an auditory-nerve fibre at a given characteristic frequency to a speech signal, and the simulated responses across fibres are assembled into a 2D neurogram (a time-frequency representation). The neurogram image is sub-divided into non-overlapping blocks, and the average value of each block is computed. The neurogram features and the traditional emotion features are concatenated to form the feature vector for each speech signal, and support vector machines are trained on these vectors to predict the emotion of the speech. The performance of the proposed method is evaluated on two well-known databases: the eNTERFACE and Berlin emotional speech data sets. The results show that the proposed method outperforms classification using the neurogram and INTERSPEECH features separately.
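
To make the pipeline in the abstract concrete, the Python sketch below block-averages a neurogram into a fixed-length vector, concatenates it with a paralinguistic feature vector, and trains an SVM. This is not the authors' code: in the paper the neurogram comes from an auditory-nerve model and the 1582-dimensional INTERSPEECH 2010 feature set is typically extracted with a toolkit such as openSMILE; here random arrays stand in for both, and the 8x8 block grid and the 64-channel by 500-frame neurogram size are illustrative assumptions.

# Minimal sketch of the described feature pipeline (not the authors' code).
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def neurogram_block_features(neurogram, n_freq_blocks=8, n_time_blocks=8):
    """Sub-divide a 2D neurogram into non-overlapping blocks and
    return the mean of each block as a flat feature vector."""
    f_edges = np.linspace(0, neurogram.shape[0], n_freq_blocks + 1, dtype=int)
    t_edges = np.linspace(0, neurogram.shape[1], n_time_blocks + 1, dtype=int)
    feats = [neurogram[f_edges[i]:f_edges[i + 1], t_edges[j]:t_edges[j + 1]].mean()
             for i in range(n_freq_blocks) for j in range(n_time_blocks)]
    return np.asarray(feats)

rng = np.random.default_rng(0)
n_utterances, n_is10 = 200, 1582          # IS10 challenge set has 1582 features
y = rng.integers(0, 6, n_utterances)      # six emotion classes, as in eNTERFACE
X = []
for _ in range(n_utterances):
    neurogram = rng.random((64, 500))     # placeholder: 64 CF channels x 500 frames
    is10 = rng.random(n_is10)             # placeholder paralinguistic vector
    X.append(np.concatenate([neurogram_block_features(neurogram), is10]))
X = np.vstack(X)

# Combined features train a single SVM, mirroring the fusion in the paper.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)
print("train accuracy:", clf.score(X, y))

The block grid controls the dimensionality of the neurogram features (8x8 gives 64 values per utterance regardless of utterance length), which is what allows them to be concatenated with the fixed-length paralinguistic vector.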
