Open Access
ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video
Author(s)
Kevin Cai,
Chonghua Liu,
David M. Chan
Publication year: 2024
The Internet's wealth of content, with up to 60% published in English, starkly contrasts with the global population, of which only 18.8% are English speakers and just 5.1% consider it their native language, leading to disparities in online information access. Unfortunately, automated dubbing of video - replacing the audio track of a video with a translated alternative - remains a complex and challenging task due to its reliance on multi-stage pipelines, which necessitate precise timing, facial movement synchronization, and prosody matching. While end-to-end dubbing offers a solution, data scarcity continues to impede the progress of both end-to-end and pipeline-based methods. In this work, we introduce Anim-400K, a comprehensive dataset of over 425K aligned animated video segments in Japanese and English supporting various video-related tasks, including automated dubbing, simultaneous translation, guided video summarization, and genre/theme/style classification. Our dataset is made publicly available for research purposes at https://github.com/davidmchan/Anim400K.
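The abstract describes aligned Japanese/English segments as the dataset's core unit. The sketch below illustrates, in plain Python, how such per-language records might be grouped into cross-lingual training pairs by a shared segment identifier. The field names (`segment_id`, `lang`, `text`) and sample values are illustrative assumptions, not the actual Anim-400K schema; consult the repository linked above for the real format.

```python
# Hypothetical sketch: grouping per-language segment records into
# aligned (Japanese, English) pairs keyed by a shared segment id.
# Field names are assumptions for illustration, NOT the Anim-400K schema.

from collections import defaultdict

# Toy records standing in for dataset metadata entries.
segments = [
    {"segment_id": "ep01_0001", "lang": "ja", "text": "こんにちは"},
    {"segment_id": "ep01_0001", "lang": "en", "text": "Hello"},
    {"segment_id": "ep01_0002", "lang": "ja", "text": "ありがとう"},
    {"segment_id": "ep01_0002", "lang": "en", "text": "Thank you"},
]

def build_pairs(records):
    """Group per-language records by segment id into (ja, en) pairs,
    keeping only segments that are aligned in both languages."""
    by_id = defaultdict(dict)
    for rec in records:
        by_id[rec["segment_id"]][rec["lang"]] = rec["text"]
    return {
        sid: (langs["ja"], langs["en"])
        for sid, langs in by_id.items()
        if "ja" in langs and "en" in langs
    }

pairs = build_pairs(segments)
print(len(pairs))  # → 2
```

Keying on a segment identifier rather than on timestamps is one simple design choice for this kind of pairing; a real loader would also carry audio/video paths and timing information per segment.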
Language(s): English