
Design and implementation of Android application to extract text from images by using Tesseract for English and Hindi
Author(s) -
Brijeshkumar Y. Panchal,
Gaurang Chauhan
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1973/1/012008
Subject(s) - computer science, Android (operating system), optical character recognition, upload, artificial intelligence, thresholding, Hindi, natural language processing, ASCII, speech recognition, world wide web, image (mathematics), programming language, operating system
The proposed implementation is an Android application that extracts text from images using Tesseract OCR. It relies on four concepts: adaptive thresholding, connected component analysis, line finding, and word recognition. With this Optical Character Recognition (OCR) technology, text printed on a clean, black-and-white, or coloured background can be converted into a computer-readable form (ASCII). The application offers two ways to supply an image for text extraction: capturing a photo, or uploading an image from the gallery. The user can then crop or edit the image to select the portion they want, after which the application converts the edited picture into text. The application supports two languages, English and Hindi.
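The first two concepts named in the abstract, adaptive thresholding and connected component analysis, can be illustrated with a minimal, standard-library-only Python sketch. This is not the authors' Android code and not Tesseract's internal implementation: the function names `adaptive_threshold` and `connected_components`, the mean-based thresholding rule, and the `window` and `offset` parameters are all assumptions chosen for illustration.

```python
from collections import deque


def adaptive_threshold(gray, window=3, offset=10):
    """Binarize a grayscale image given as a list of rows (values 0-255).

    Hypothetical helper: a pixel becomes foreground (1) when it is darker
    than the mean of its local window minus `offset`, otherwise background (0).
    """
    h, w = len(gray), len(gray[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the neighbourhood, clipped at the image borders.
            vals = [gray[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            local_mean = sum(vals) / len(vals)
            out[y][x] = 1 if gray[y][x] < local_mean - offset else 0
    return out


def connected_components(binary):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected foreground blobs.

    Hypothetical helper: boxes are listed in raster-scan order of the first
    pixel found in each blob.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill over this blob, tracking its extent.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


# Toy 5x7 image: light background (200) with two dark vertical strokes (50).
image = [[200] * 7 for _ in range(5)]
for row in (1, 2, 3):
    image[row][1] = 50
    image[row][5] = 50

binary = adaptive_threshold(image)
print(connected_components(binary))  # one bounding box per stroke
```

Each bounding box stands for one candidate glyph fragment. In an OCR pipeline of the kind the abstract describes, such components would subsequently be grouped into lines and words before recognition; that grouping is not shown here.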