A novel technique for estimation of skew in binary text document images based on linear regression analysis
Author(s) -
Palaiahnakote Shivakumara,
G. Hemantha Kumar,
D. S. Guru,
P. Nagabhushan
Publication year - 2005
Publication title -
Sadhana
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.268
H-Index - 49
eISSN - 0973-7677
pISSN - 0256-2499
DOI - 10.1007/bf02710080
Subject(s) - skew, computer science, digitization, binary number, artificial intelligence, pixel, boundary (topology), document layout analysis, space (punctuation), field (mathematics), pattern recognition (psychology), linear regression, optical character recognition, tilt (camera), image (mathematics), mathematics, computer vision, geometry, machine learning, telecommunications, mathematical analysis, arithmetic, pure mathematics, operating system
When a document is scanned, either mechanically or manually, for digitization, it often suffers from some degree of skew or tilt. Skew-angle detection plays an important role in document analysis systems and OCR in achieving the expected accuracy. In this paper, we consider skew estimation for Roman script. The method uses a boundary growing approach to extract the lowermost and uppermost pixel coordinates of the characters in the text lines present in the document; these coordinates are then subjected to linear regression analysis (LRA) to determine the skew angle of the document. The proposed technique also works for scaled binary text documents. It relies on the assumption that the space between text lines is greater than the space between words and characters. Finally, to evaluate the performance of the proposed method, we compare the experimental results with those of well-known existing methods.
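The regression step described in the abstract can be illustrated with a minimal sketch. It assumes the lowermost (or uppermost) pixel coordinates of one text line have already been extracted; the boundary growing step is not reproduced here. The function name `estimate_skew_degrees` and the synthetic sample points are hypothetical and are not taken from the paper.

```python
import math


def estimate_skew_degrees(points):
    """Fit y = a + b*x by ordinary least squares and return atan(b) in degrees.

    `points` is a list of (x, y) pixel coordinates, assumed to lie along the
    bottom (or top) edge of a single text line. In image coordinates y usually
    grows downward, so the sign of the angle depends on that convention.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x coordinates are identical")
    slope = sxy / sxx
    return math.degrees(math.atan(slope))


# Hypothetical data: points sampled along a line tilted by 5 degrees
true_angle = 5.0
baseline = [
    (x, x * math.tan(math.radians(true_angle)) + 40.0)
    for x in range(0, 200, 10)
]
angle = estimate_skew_degrees(baseline)
```

With real scans the extracted coordinates are noisy. One plausible refinement, again an assumption rather than something the abstract states, is to compute the angle per text line and combine the per-line estimates.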