Music genre recognition using convolutional recurrent neural network architecture
Author(s) - Bisharad Dipjyoti, Laskar Rabul Hussain
Publication year - 2019
Publication title - Expert Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.365
H-Index - 38
eISSN - 1468-0394
pISSN - 0266-4720
DOI - 10.1111/exsy.12429
Subject(s) - computer science, classifier (uml), convolutional neural network, artificial intelligence, feature extraction, speech recognition, architecture, pattern recognition (psychology), set (abstract data type), art, visual arts, programming language
Genre is an abstract feature, yet it is considered one of the important characteristics of music. Genre recognition forms an essential component of a large number of commercial music applications. Most existing music genre recognition algorithms are based on manual feature extraction techniques; the extracted features are used to develop a classifier model that identifies the genre. However, in many cases a set of features that gives excellent accuracy fails to explain the underlying typical characteristics of music genres. Some features also provide a satisfactory level of performance on a particular dataset but fail to provide similar performance on other datasets. Hence, each dataset mostly requires manual selection of appropriate acoustic features to achieve an adequate level of performance. In this paper, we propose a genre recognition algorithm that uses almost no handcrafted features. The convolutional recurrent neural network-based model proposed in this study is trained on mel-spectrograms extracted from 3-s audio clips taken from the GTZAN dataset. The proposed model achieves an accuracy of 85.36% on 10-class genre classification. The same model has been trained and tested on 10 genres of the MagnaTagATune dataset, comprising 18,476 clips of 29-s duration, where it yields an accuracy of 86.06%. The experimental results suggest that the proposed architecture, with the mel-spectrogram as input feature, is capable of providing consistent performance across different datasets.
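
The abstract states that the model is trained on 3-s clips but does not say how those clips are cut. The following is a minimal sketch of one plausible preprocessing step, not the authors' implementation. It assumes non-overlapping segments with any trailing partial segment discarded, and it assumes GTZAN-style input of 30-s clips sampled at 22,050 Hz (a property of the dataset, not something given in the abstract). The function name `split_into_segments` and its parameters are hypothetical.

```python
from typing import List, Sequence


def split_into_segments(
    samples: Sequence[float],
    sample_rate: int,
    segment_seconds: float = 3.0,
) -> List[Sequence[float]]:
    """Cut a waveform into fixed-length, non-overlapping segments.

    Assumption: segments do not overlap and a trailing partial
    segment is dropped. The paper may use a different scheme.
    """
    segment_length = int(round(sample_rate * segment_seconds))
    if segment_length <= 0:
        raise ValueError("segment length must be at least one sample")

    # Number of complete segments that fit in the waveform.
    n_full = len(samples) // segment_length
    return [
        samples[i * segment_length:(i + 1) * segment_length]
        for i in range(n_full)
    ]


# Assumed GTZAN-style clip: 30 s of silence at 22,050 Hz.
clip = [0.0] * (22050 * 30)
segments = split_into_segments(clip, sample_rate=22050)
print(len(segments), len(segments[0]))
```

Under these assumptions each 30-s clip yields ten 3-s segments of 66,150 samples, and each segment would then be converted to a mel-spectrogram before being passed to the network.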
