Text area detection in handwritten documents scanned for further processing

Open Access

Author(s) - Jakub Pach, Artur Krupa, Izabella Antoniuk

Publication year - 2020

Publication title - Machine Graphics and Vision

Language(s) - English

Resource type - Journals

eISSN - 2720-250X

pISSN - 1230-0535

DOI - 10.22630/mgv.2020.29.1.2

Subject(s) - computer science, image processing, artificial intelligence, column (typography), document processing, noise (video), text recognition, binary image, pattern recognition (psychology), line (geometry), image (mathematics), binary number, optical character recognition, computer vision, arithmetic, mathematics, telecommunications, geometry, frame (networking)

In this paper we present an approach to text area detection that uses binary images, the Constrained Run Length Algorithm, and additional noise reduction methods to remove artefacts. Text processing includes various activities, most of which are related to preparing input data for further operations in a way that does not hinder the OCR algorithms. This matters especially for handwritten manuscripts, and even more so for very old documents. We present our methodology for the text area detection problem, which is capable of removing most irrelevant objects, including page edges, stains, and folds. At the same time, the presented method can handle multi-column texts and varying line thickness. The generated mask can accurately mark the actual text area, so that the output image can be easily used in further text processing steps.
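The abstract names the Constrained Run Length Algorithm without describing it, so a small illustration of the general idea may help. The sketch below is not the authors' implementation. It assumes a binary image stored as nested lists with 1 for ink and 0 for background, and it fills only background runs that are bounded by ink on both sides, which is a choice made for this sketch. The function names `smooth_runs` and `crla_mask` and the thresholds `h_gap` and `v_gap` are hypothetical.

```python
def smooth_runs(row, max_gap):
    """Fill runs of 0s no longer than max_gap that lie between two 1s."""
    out = list(row)
    last_one = None  # index of the most recent ink pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            gap = i - last_one - 1 if last_one is not None else 0
            if 0 < gap <= max_gap:
                # Short background run between two ink pixels: merge it.
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out


def crla_mask(image, h_gap, v_gap):
    """Smooth rows and columns separately, then AND the two results."""
    horizontal = [smooth_runs(row, h_gap) for row in image]
    # Transpose, smooth each column as if it were a row, transpose back.
    columns = [smooth_runs(list(col), v_gap) for col in zip(*image)]
    vertical = [list(row) for row in zip(*columns)]
    return [
        [h & v for h, v in zip(h_row, v_row)]
        for h_row, v_row in zip(horizontal, vertical)
    ]


if __name__ == "__main__":
    # The gap of 2 is filled, the gap of 4 is left open.
    print(smooth_runs([0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0], 2))
    # [0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
```

With suitable thresholds, nearby strokes merge into blocks that approximate words and lines, while isolated specks stay small enough for a later filtering step to discard. Choosing those thresholds for a given scan resolution is left open here.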
