Classification of printed and handwritten text using hybrid techniques for Gurumukhi script
Author(s) - Manpreet Kaur, Balwinder Singh
Publication year - 2019
Publication title - International Journal of Engineering and Computer Science
Language(s) - English
Resource type - Journals
ISSN - 2319-7242
DOI - 10.18535/ijecs/v8i04.4298
Subject(s) - computer science, optical character recognition, character (mathematics), artificial intelligence, scripting language, process (computing), natural language processing, segmentation, character encoding, feature (linguistics), pattern recognition (psychology), document processing, intelligent word recognition, scanner, character recognition, intelligent character recognition, image (mathematics), speech recognition, linguistics, philosophy, geometry, mathematics, operating system
Text classification is a crucial step in optical character recognition. The output of a scanner is a non-editable image, so the scanned text cannot be changed when required. This limitation motivates optical character recognition. Optical Character Recognition (OCR) is the process of converting scanned images of machine-printed or handwritten text into a computer-readable format. The OCR process involves several steps after image acquisition: pre-processing, segmentation, feature extraction, and classification. Incorrect classification produces a garbage-in, garbage-out result. Existing methods focus only on the classification of unmixed characters in Arabic, English, Latin, Farsi, Bangla, and Devanagari scripts. The proposed hybrid technique addresses the mixed (machine-printed and handwritten) character classification problem. Classification is carried out on different kinds of daily-use forms, such as self-declaration forms, admission forms, verification forms, university forms, certificates, banking forms, dairy forms, and Punjab government forms. The proposed technique can classify handwritten and machine-printed text written in Gurumukhi script within mixed text. It has been tested on 150 different kinds of forms in Gurumukhi and Roman scripts. It achieves 93% accuracy on mixed-character forms and 96% accuracy on unmixed-character forms. The overall accuracy of the proposed technique is 94.5%.
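
The reported overall accuracy of 94.5% matches the unweighted mean of the two per-category figures (93% and 96%). The short Python sketch below checks that arithmetic. The abstract does not say how the 150 forms were divided between mixed and unmixed categories, so the form counts used here are hypothetical and serve only to show how a weighted figure would depend on that split. The function names are illustrative and do not come from the paper.

```python
def unweighted_mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    return sum(values) / len(values)


def weighted_accuracy(accuracy_by_group, count_by_group):
    """Return the accuracy over all forms, weighting each group by its form count."""
    total = sum(count_by_group.values())
    if total == 0:
        raise ValueError("need at least one form")
    weighted_sum = sum(
        accuracy_by_group[group] * count_by_group[group]
        for group in accuracy_by_group
    )
    return weighted_sum / total


# Per-category accuracies as stated in the abstract (percent).
reported = {"mixed": 93.0, "unmixed": 96.0}

# Unweighted mean of the two reported figures.
overall = unweighted_mean(reported.values())

# HYPOTHETICAL splits of the 150 forms; the abstract does not give the real one.
equal_split = weighted_accuracy(reported, {"mixed": 75, "unmixed": 75})
uneven_split = weighted_accuracy(reported, {"mixed": 100, "unmixed": 50})

print(overall)       # 94.5
print(equal_split)   # 94.5
print(uneven_split)  # 94.0
```

Under an equal split the weighted and unweighted figures coincide at 94.5%, while an uneven split would shift the weighted value toward the larger category.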