
Open Access
ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video
Author(s): Kevin Cai, Chonghua Liu, David M. Chan
Publication year: 2024
The Internet's wealth of content, with up to 60% published in English, starkly contrasts the global population, where only 18.8% are English speakers, and just 5.1% consider it their native language, leading to disparities in online information access. Unfortunately, automated processes for dubbing of video - replacing the audio track of a video with a translated alternative - remain complex and challenging due to their reliance on pipelines requiring precise timing, facial movement synchronization, and prosody matching. While end-to-end dubbing offers a solution, data scarcity continues to impede the progress of both end-to-end and pipeline-based methods. In this work, we introduce Anim-400K, a comprehensive dataset of over 425K aligned animated video segments in Japanese and English supporting various video-related tasks, including automated dubbing, simultaneous translation, guided video summarization, and genre/theme/style classification. Our dataset is made publicly available for research purposes at https://github.com/davidmchan/Anim400K.
Language(s): English
