A Robust Script Identification System for Historical Indian Document Images
Author(s) - S. Kavitha, Palaiahnakote Shivakumara, G. Hemantha Kumar, C.L. Tan
Publication year - 2015
Publication title - Malaysian Journal of Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.197
H-Index - 18
ISSN - 0127-9084
DOI - 10.22452/mjcs.vol28no4.2
Subject(s) - telugu, scripting language, computer science, hindi, optical character recognition, identification (biology), artificial intelligence, lexicographical order, historical document, gujarati, tamil, natural language processing, character (mathematics), similarity (geometry), information retrieval, decipherment, pattern recognition (psychology), image (mathematics), linguistics, philosophy, botany, geometry, mathematics, combinatorics, biology, operating system
Automatic script identification in document archives is essential for locating a specific document so that an appropriate Optical Character Recognizer (OCR) can be chosen for recognition. In addition, identifying Indus script, one of the oldest historical scripts, is challenging and interesting because of inter-script similarities. In this work, we propose a new robust script identification system for Indian scripts that covers Indus documents and other scripts, namely English, Kannada, Tamil, Telugu, Hindi and Gujarati, and thereby helps in selecting an appropriate OCR for recognition. The proposed system explores the spatial relationship between the dominant points (intersection points, end points and junction points) of the connected components in the documents to extract the structure of the components. The degree of similarity between the scripts is studied by computing the variances of the proximity matrices of the dominant points of the respective scripts. The method is evaluated on 700 scanned document images. Experimental results show that the proposed system outperforms the existing methods in terms of classification rate.
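The abstract does not define the proximity matrix or how its variance is taken, so the following is only an illustrative sketch under stated assumptions: the dominant points of one connected component are taken as already extracted (x, y) coordinates, proximity is assumed to be pairwise Euclidean distance, and the variance is computed over the distinct off-diagonal entries. The function names `proximity_matrix` and `proximity_variance` are hypothetical and do not come from the paper.

```python
import math
from statistics import pvariance


def proximity_matrix(points):
    """Pairwise Euclidean distances between dominant points.

    Assumption: 'proximity' means Euclidean distance; the paper may
    use a different measure.
    """
    return [[math.dist(p, q) for q in points] for p in points]


def proximity_variance(points):
    """Population variance of the distinct pairwise distances.

    Only the upper triangle is used, because the matrix is symmetric
    and its diagonal is always zero.
    """
    matrix = proximity_matrix(points)
    n = len(points)
    upper = [matrix[i][j] for i in range(n) for j in range(i + 1, n)]
    if not upper:
        # Fewer than two points: no pairwise distance to measure.
        return 0.0
    return pvariance(upper)


# Hypothetical dominant points of one component (corners of a unit square).
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
print(proximity_variance(square))
```

Under these assumptions, components whose dominant points are spread evenly yield a low variance, while elongated or irregular components yield a higher one, which gives a single number per component that could be compared across scripts.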