Open Access
Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network
Author(s) -
Tanvi Puri,
Mukesh Soni,
Gaurav Dhiman,
Osamah Ibrahim Khalaf,
Malik Bader Alazzam,
Ihtiram Raza Khan
Publication year - 2022
Publication title -
Journal of Healthcare Engineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.509
H-Index - 29
eISSN - 2040-2309
pISSN - 2040-2295
DOI - 10.1155/2022/8472947
Subject(s) - disgust, computer science, speech recognition, surprise, spectrogram, convolutional neural network, artificial neural network, hidden Markov model, mel frequency cepstrum, emotion classification, artificial intelligence, natural language processing, feature extraction, psychology, communication, anger, psychiatry
Every human speaker conveys emotion, and for a customer representative, recognizing that emotion helps in understanding the customer's needs. Speech emotion recognition therefore plays an important role in human interaction. To support such intelligent systems, we design a convolutional neural network (CNN) based model that classifies emotions into broad categories such as positive and negative, or into more specific classes. In this paper, we use audio recordings from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Log Mel Spectrogram and Mel-Frequency Cepstral Coefficient (MFCC) features were extracted from the raw audio files. Such features have previously been used for emotion classification with techniques including Long Short-Term Memory (LSTM) networks, CNNs, Hidden Markov Models (HMMs), and Deep Neural Networks (DNNs). For this paper, we divide the emotions into three classification settings for male and female speakers. In the first setting, the emotions are grouped into two classes, positive and negative. In the second setting, they are grouped into three classes: positive, negative, and neutral. In the third setting, they are grouped into eight classes: happy, sad, angry, fearful, surprised, disgusted, calm, and neutral. For all three settings, we propose a model consisting of eight consecutive 2D convolutional layers. The proposed model outperforms previously reported models, enabling better identification of consumer emotion.
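The feature-extraction step described above (log-mel spectrograms and MFCCs from raw audio) can be sketched in pure NumPy. This is a minimal illustration, not the authors' pipeline: the frame size, hop length, number of mel bands, and the synthetic sine-wave input standing in for a RAVDESS clip are all assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def log_mel_spectrogram(y, sr, n_fft=512, hop=256, n_mels=40):
    # Frame the signal with a Hann window and take the power spectrum.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2   # (frames, n_fft//2 + 1)

    # Triangular mel filterbank: band edges equally spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fb[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[m - 1, k] = (right - k) / max(right - center, 1)

    return np.log(power @ fb.T + 1e-10)               # (frames, n_mels)

def mfcc(log_mel, n_mfcc=13):
    # DCT-II over the mel axis, keeping the first n_mfcc coefficients.
    n_mels = log_mel.shape[1]
    n = np.arange(n_mels)
    basis = np.cos(np.pi * np.outer(np.arange(n_mfcc),
                                    (2 * n + 1) / (2.0 * n_mels)))
    return log_mel @ basis.T                          # (frames, n_mfcc)

# Synthetic 1-second, 16 kHz tone as a stand-in for a RAVDESS recording.
sr = 16000
t = np.linspace(0.0, 1.0, sr, endpoint=False)
y = np.sin(2 * np.pi * 440.0 * t)
lm = log_mel_spectrogram(y, sr)
feats = mfcc(lm)
print(lm.shape, feats.shape)
```

The resulting 2D arrays (time frames by mel bands, or time frames by cepstral coefficients) are the kind of spectrogram-like input that a stack of 2D convolutional layers, as proposed in the paper, can consume directly.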
