Open Access
Adding Crowd Noise to Sports Commentary using Generative Models
Author(s) - Neil Shah, Dharmeshkumar Agrawal, Niranajan Pedanekar
Publication year - 2021
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5753/lique.2021.15715
Subject(s) - computer science, stadium, unavailability, noise (video), sound (geography), generative grammar, similarity (geometry), speech recognition, human–computer interaction, artificial intelligence, acoustics, engineering, mathematics, physics, geometry, reliability engineering, image (mathematics)
Crowd noise is an integral part of the live sports experience. In the post-COVID era, when live audiences are absent, crowd noise must be added to the live commentary. This paper exploits the correlation between the commentary and the crowd noise of a live sports event and presents a method for stylizing sports commentary audio by generating live stadium-like sound with neural generative models. We use Generative Adversarial Network (GAN)-based architectures such as Cycle-consistent GANs (Cycle-GANs) and Mel-GANs to generate live stadium-like sound samples from the live commentary. Because raw commentary sound samples are unavailable, we use end-to-end time-domain source separation models (SEGAN and Wave-U-Net) to extract the commentary from combined live recordings taken from YouTube highlight videos of soccer matches. We present a qualitative evaluation and a subjective user evaluation of the similarity between the generated live sound and the reference live sound.
