Annotated Databases for the Recognition of Screen-Rendered Text

Steffen Wachenfeld; Hans-Ulrich Klein; Xiaoyi Jiang


Open Access


Publication year - 2007

Publication title - Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)

Language(s) - English

DOI - 10.1109/icdar.2007.58

The recognition of screen-rendered text is a novel task. It is performed, for example, by translation tools that let users click on any text on the screen and receive a translation. Some commercial OCR programs have also begun to address the problem of reading screenshots. Optical character recognition on screenshot images can be very challenging because the fonts are very small and smoothed. To build and compare recognition approaches for screen-rendered text, the availability of standard databases is a fundamental prerequisite. This paper presents two freely available databases: one consists of annotated screenshot images of 28,080 single characters, and the other holds 400 words extracted from documents plus 2,400 generated isolated words. Both databases include meta-information such as x-height, font type, style and rendering conditions. Using a recognition system developed by the authors as an example, the paper shows how these databases can serve for training, testing and optimization.
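To make the role of the meta-information concrete, the following is a minimal sketch of how one annotated sample might be represented and how the annotations could drive a train/test split. It is not the databases' actual file format: the class `GlyphSample`, its field names, the example values, and the helpers `split_by_font` and `count_by` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GlyphSample:
    """One annotated screenshot sample (hypothetical schema, not the real format)."""
    image_path: str  # assumed location of the cropped screenshot image
    label: str       # annotated character or word
    x_height: int    # x-height in pixels
    font: str        # font type, e.g. "Arial"
    style: str       # font style, e.g. "regular" or "bold"
    rendering: str   # rendering condition, e.g. "smoothed"


def split_by_font(samples, held_out_fonts):
    """Put every sample whose font is in held_out_fonts into the test set."""
    train = [s for s in samples if s.font not in held_out_fonts]
    test = [s for s in samples if s.font in held_out_fonts]
    return train, test


def count_by(samples, field):
    """Count samples per value of one annotation field, e.g. 'style'."""
    counts = defaultdict(int)
    for s in samples:
        counts[getattr(s, field)] += 1
    return dict(counts)
```

Holding out whole fonts, as `split_by_font` does, is only one possible protocol; the same annotations would equally allow splits by x-height or by rendering condition.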
