Open Access
Extending Deep Rhythm for Tempo and Genre Estimation Using Complex Convolutions, Multitask Learning and Multi-input Network
Author(s) -
Hadrien Foroughmand,
Geoffroy Peeters
Publication year - 2022
Publication title -
Journal of Creative Music Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.12
H-Index - 4
ISSN - 2399-7656
DOI - 10.5920/jcms.887
Subject(s) - rhythm , timbre , spectrogram , computer science , convolutional neural network , speech recognition , artificial intelligence , deep learning , musical
Tempo and genre are two interleaved aspects of music: genres are often associated with rhythm patterns that are played within specific tempo ranges. In this paper, we focus on the Deep Rhythm system, which is based on a harmonic representation of rhythm used as input to a convolutional neural network. To take the relationships between frequency bands into account, we process complex-valued inputs through complex convolutions. We also study the joint estimation of tempo and genre using a multitask learning approach. Finally, we study the addition of a second convolutional input branch applied to a mel-spectrogram input dedicated to timbre. This multi-input approach improves performance for both tempo and genre estimation.
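
The following is a minimal PyTorch sketch, not the authors' code, illustrating the three ideas named in the abstract: a complex convolution built from two real-valued convolutions, a second mel-spectrogram input branch, and two task heads (tempo and genre) trained jointly with a weighted sum of losses. All layer sizes, input shapes, class counts, and the loss weight are illustrative assumptions.

import torch
import torch.nn as nn


class ComplexConv2d(nn.Module):
    """Complex convolution from two real Conv2d layers:
    (a + ib) * (w + iv) = (aw - bv) + i(av + bw)."""
    def __init__(self, in_ch, out_ch, kernel_size, **kw):
        super().__init__()
        self.conv_r = nn.Conv2d(in_ch, out_ch, kernel_size, **kw)
        self.conv_i = nn.Conv2d(in_ch, out_ch, kernel_size, **kw)

    def forward(self, x_r, x_i):
        out_r = self.conv_r(x_r) - self.conv_i(x_i)
        out_i = self.conv_r(x_i) + self.conv_i(x_r)
        return out_r, out_i


class MultiInputMultitaskNet(nn.Module):
    def __init__(self, n_tempo_classes=256, n_genre_classes=10):
        super().__init__()
        # Branch 1: complex-valued harmonic rhythm representation.
        self.rhythm_conv = ComplexConv2d(1, 16, kernel_size=3, padding=1)
        self.rhythm_pool = nn.AdaptiveAvgPool2d((8, 8))
        # Branch 2: real-valued mel-spectrogram dedicated to timbre.
        self.mel_conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)),
        )
        # Shared layer after concatenating both branches.
        self.shared = nn.Sequential(
            nn.Flatten(), nn.Linear((32 + 16) * 8 * 8, 128), nn.ReLU())
        # Task-specific heads (multitask learning).
        self.tempo_head = nn.Linear(128, n_tempo_classes)
        self.genre_head = nn.Linear(128, n_genre_classes)

    def forward(self, rhythm_real, rhythm_imag, mel):
        r, i = self.rhythm_conv(rhythm_real, rhythm_imag)
        rhythm_feat = self.rhythm_pool(torch.relu(torch.cat([r, i], dim=1)))
        mel_feat = self.mel_conv(mel)
        h = self.shared(torch.cat([rhythm_feat, mel_feat], dim=1))
        return self.tempo_head(h), self.genre_head(h)


# Joint training step: one cross-entropy loss per task, combined with an
# assumed weight of 0.5 on the genre term.
model = MultiInputMultitaskNet()
rhythm_r = torch.randn(4, 1, 40, 64)   # assumed shape of the rhythm input
rhythm_i = torch.randn(4, 1, 40, 64)
mel = torch.randn(4, 1, 128, 64)       # assumed mel-spectrogram shape
tempo_logits, genre_logits = model(rhythm_r, rhythm_i, mel)
tempo_target = torch.randint(0, 256, (4,))
genre_target = torch.randint(0, 10, (4,))
loss = nn.functional.cross_entropy(tempo_logits, tempo_target) \
     + 0.5 * nn.functional.cross_entropy(genre_logits, genre_target)
loss.backward()

Splitting the complex convolution into real and imaginary kernel pairs is the standard deep-complex-network trick; the two branches are fused by simple concatenation here, which is one plausible way to combine the rhythm and timbre inputs before the shared layers.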