Open Access
Automatic video caption detection and extraction in the DCT compressed domain
Author(s) -
Chin-Fu Tsao,
Yuhao Chen,
Jin-Hau Kuo,
Chia-Wei Lin,
Ja-Ling Wu
Publication year - 2005
Publication title -
Visual Communications and Image Processing
Language(s) - English
Resource type - Conference proceedings
ISSN - 2522-6770
DOI - 10.1117/12.631588
Subject(s) - computer science , artificial intelligence , computer vision , discrete cosine transform , decoding methods , frame (networking) , pixel , compressed domain , semantics , texture analysis , image processing , pattern recognition
The text in a video frame can help us understand the semantics of video content directly. Although there are many approaches that can automatically detect and localize text in a video, most of them use the original pixels of an image to find the text regions. In this paper, we present an approach to automatically localize captions in MPEG compressed videos. Caption regions are segmented from the background by using their distinguishing texture characteristics. Unlike previously published approaches, which fully decompress the video sequence before extracting the caption regions or only extract text regions in Intra- (I-) frames, our approach detects and localizes caption regions directly in the DCT compressed domain. Therefore, only a very small amount of decoding is required. Experiments show that a good caption detection rate can be obtained; the average recalls of Intra- and Inter-frame detections are 97.77% and 97.84%, respectively.
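The abstract's core idea, that caption regions stand out from the background by their texture and can be found from DCT coefficients without full decoding, can be illustrated with a minimal sketch. The code below is not the paper's algorithm; it assumes the 8x8 DCT coefficient blocks have already been parsed from the bitstream, and uses a simple AC-energy measure (sum of squared coefficients minus the DC term) with a hypothetical threshold to flag high-texture candidate blocks.

```python
import numpy as np

def detect_caption_blocks(dct_blocks: np.ndarray, threshold: float) -> np.ndarray:
    """Flag candidate caption blocks by their AC texture energy.

    dct_blocks: array of shape (rows, cols, 8, 8) holding per-block
    DCT coefficients, as would be parsed from an MPEG bitstream.
    Returns a (rows, cols) boolean mask of high-texture blocks.
    """
    coeffs = dct_blocks.astype(np.float64)
    # Total energy of each block, then subtract the DC term at [0, 0]
    # so only AC (texture) energy remains.
    total_energy = np.sum(coeffs ** 2, axis=(2, 3))
    ac_energy = total_energy - coeffs[:, :, 0, 0] ** 2
    # The threshold is an illustrative free parameter, not a value
    # taken from the paper.
    return ac_energy > threshold

# Usage: one smooth block (DC only) and one textured block.
blocks = np.zeros((1, 2, 8, 8))
blocks[0, 0, 0, 0] = 100.0   # smooth background block: only DC energy
blocks[0, 1, 0, 0] = 100.0
blocks[0, 1, 1, 2] = 30.0    # strong AC coefficient: caption-like texture
mask = detect_caption_blocks(blocks, threshold=50.0)
```

Because the measure is computed directly on the coefficients, no inverse DCT or pixel reconstruction is needed, which is the efficiency argument the abstract makes for compressed-domain processing.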
