
A ROBUST BINARIZATION AND TEXT LINE DETECTION IN HISTORICAL HANDWRITTEN DOCUMENTS ANALYSIS
Author(s) - Jakub Pach, Piotr Bilski
Publication year - 2016
Publication title - computing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.184
H-Index - 11
eISSN - 2312-5381
pISSN - 1727-6209
DOI - 10.47839/ijc.15.3.848
Subject(s) - Hough transform, computer science, artificial intelligence, pattern recognition (psychology), preprocessor, histogram, connected component, maxima and minima, block (permutation group theory), line (geometry), image (mathematics), Gaussian, point (geometry), luminance, mathematics, mathematical analysis, physics, geometry, quantum mechanics
In this paper, we present a novel method for detecting text lines in handwritten documents based on the block-based Hough transform. To maximize its efficiency, a robust binarization algorithm is applied; it is based on Gaussian filtering and handles non-uniform luminance. The proposed technique consists of three steps: preprocessing, detection of potential text lines, and elimination of false ones. The first step covers image binarization, extraction of connected components, and selection of supporting connected components based on the local maxima in the vertical histogram stripes. Second, an appropriate subset of connected components, supplemented by one-point components, is selected. Finally, the block-based Hough transform is applied to detect potential text lines and to find those identified incorrectly. The proposed method is applied to the analysis of fifteenth-century Latin manuscripts. Our approach is more effective than traditional ones, by up to twenty percent in the best cases.
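The voting idea behind Hough-based text line detection can be illustrated with a minimal sketch. It is not the authors' block-based variant, whose details the abstract does not give: it is a plain point-voting Hough transform, assuming each input point stands in for the centroid of a connected component (or of a block cut from one). The function name `hough_peaks` and the parameters `theta_degs`, `rho_step` and `top_k` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def hough_peaks(points, theta_degs=range(80, 101), rho_step=1, top_k=2):
    """Return the top_k accumulator cells as (theta_deg, rho, votes).

    Illustrative sketch only: plain point-voting Hough transform.
    points     -- iterable of (x, y) image coordinates (assumed centroids)
    theta_degs -- candidate angles of the line normal, in degrees;
                  values near 90 correspond to near-horizontal text lines
    rho_step   -- width of one rho bin, in pixels
    """
    acc = Counter()
    for x, y in points:
        for t in theta_degs:
            th = math.radians(t)
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = x * math.cos(th) + y * math.sin(th)
            acc[(t, round(rho / rho_step))] += 1

    # Convert bin indices back to rho values and rank cells by vote count;
    # ties are broken by theta, then rho, to keep the result deterministic.
    cells = [(t, r * rho_step, votes) for (t, r), votes in acc.items()]
    cells.sort(key=lambda c: (-c[2], c[0], c[1]))
    return cells[:top_k]


if __name__ == "__main__":
    # Two synthetic horizontal "text lines" at y = 20 and y = 60.
    pts = [(x, 20) for x in range(0, 101, 10)]
    pts += [(x, 60) for x in range(0, 101, 10)]
    print(hough_peaks(pts))  # [(90, 20, 11), (90, 60, 11)]
```

Each of the two strongest cells collects all eleven points of one synthetic line. In a real pipeline, cells with too few votes would be candidates for the false-line elimination step; the threshold for that is not stated in the abstract and is left out here.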