
Image Purification Technique for Myanmar OCR Applying Skew Angle Detection and Free Skew
Author(s) -
Chit San Lwin,
Xin Wu
Publication year - 2019
Publication title -
International Journal of Scientific Research in Science and Technology
Language(s) - English
Resource type - Journals
eISSN - 2395-602X
pISSN - 2395-6011
DOI - 10.32628/ijsrst19615
Subject(s) - skew , optical character recognition , computer science , robustness (evolution) , segmentation , artificial intelligence , character (mathematics) , line (geometry) , partition (number theory) , computer vision , pattern recognition (psychology) , image (mathematics) , mathematics , telecommunications , biochemistry , chemistry , geometry , combinatorics , gene
Optical Character Recognition (OCR) is a technology widely adopted for automatically converting hardcopy text into editable text. Because the technology is language dependent, it is far less developed for less widely supported languages such as Myanmar. In addition, the uniqueness and complexity of the Myanmar writing system, such as touching and complex characters, continue to pose serious challenges for OCR researchers. In this paper, we propose a new technique for developing a Myanmar OCR system. Our technique implements skew angle detection and free skew, noisy border correction, extra page elimination, and line segmentation on scanned images of Myanmar text. The proposed method is tested on 430 documents comprising printed and handwritten Myanmar text with various fonts and sizes, multi-column layouts, tables, stamps or photos, and background effects. Our method gives an accuracy of 100% for line segmentation and 99.92% for skew angle detection and free skew. The ability of our method to perform global and local skew angle detection, free skew, and line segmentation on different handwritten and digital text images of the Myanmar character set with high accuracy confirms the robustness and reliability of the technique and its suitability for application to many other related languages.
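The abstract does not state which algorithms the authors use for skew angle detection, free skew, or line segmentation. As an illustration only, the sketch below uses a generic projection-profile approach, a common textbook technique that is not necessarily the paper's method. It assumes a binary image stored as a list of rows with 1 for ink and 0 for background, approximates skew removal with a vertical shear rather than a true rotation, and treats a positive angle as text that drops toward the right (the image y axis points down). All function names, parameters, and the synthetic test page are hypothetical.

```python
import math


def segment_lines(image, min_ink=1):
    """Return (start, end) row ranges, end exclusive, of consecutive inked rows."""
    lines, start = [], None
    for y, row in enumerate(image):
        count = sum(row)  # horizontal projection: ink pixels in this row
        if count >= min_ink and start is None:
            start = y
        elif count < min_ink and start is not None:
            lines.append((start, y))
            start = None
    if start is not None:
        lines.append((start, len(image)))
    return lines


def shear_row(x, y, tangent):
    """Row that pixel (x, y) lands on after a vertical shear undoing the skew."""
    return round(y - x * tangent)


def estimate_skew(image, max_angle=10.0, step=0.5):
    """Try candidate angles; keep the one whose row profile is most peaked."""
    ink = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    best_angle, best_score = 0.0, -1
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        tangent = math.tan(math.radians(angle))
        counts = {}
        for x, y in ink:
            r = shear_row(x, y, tangent)
            counts[r] = counts.get(r, 0) + 1
        # Sum of squared row counts is largest when ink concentrates in few rows.
        score = sum(c * c for c in counts.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def deskew(image, angle):
    """Apply the shear for the given angle; pixels leaving the page are dropped."""
    tangent = math.tan(math.radians(angle))
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            r = shear_row(x, y, tangent)
            if v and 0 <= r < height:
                out[r][x] = 1
    return out


def synthetic_page(width=120, height=40, baselines=(5, 20), angle=5.0):
    """Hypothetical test page: one-pixel-thick 'text lines' drawn at a known skew."""
    tangent = math.tan(math.radians(angle))
    page = [[0] * width for _ in range(height)]
    for y0 in baselines:
        for x in range(width):
            page[y0 + round(x * tangent)][x] = 1
    return page


if __name__ == "__main__":
    page = synthetic_page()
    angle = estimate_skew(page)
    print("estimated skew (degrees):", angle)
    print("line ranges after deskew:", segment_lines(deskew(page, angle)))
```

A real scanned page would first need binarization and noise removal, and a global angle search like this one does not address the local skew the abstract mentions; the sketch only shows how a skew estimate and a row profile can feed line segmentation.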