Open Access
Recognition of Nastaliq Urdu Text using Multi-SVM
Author(s) -
Herleen Kour,
Mehvish Yasin,
Naveen Kumar Gondhi
Publication year - 2020
Publication title -
international journal of recent technology and engineering
Language(s) - English
Resource type - Journals
ISSN - 2277-3878
DOI - 10.35940/ijrte.e6949.018520
Subject(s) - urdu , cursive , computer science , artificial intelligence , centroid , support vector machine , speech recognition , pattern recognition (psychology) , segmentation , feature (linguistics) , field (mathematics) , mathematics , linguistics , philosophy , pure mathematics
Optical Character Recognition has emerged as an attractive research field nowadays. Lot of work has been done in Urdu script based on various approaches and diverse methodologies have been put forward based on Nastaliq font style. Urdu is written diagonally from top to bottom, the style known as Nastaliq. This feature of Nastaliq makes Urdu highly cursive and more sensitive leading to a difficult recognition problem. Due to the peculiarities of Nastaliq Style of writing, we have chosen ligature as a basic unit of recognition in order to reduce the complexity of system. The accuracy rate of recognizing ligature in Urdu text corresponds to the efficiency with which the ligatures are segmented. In addition to extracting connected components, the ligature segmentation takes into consideration various factors like baseline information, height, width, and centroid. In this paper ligature Recognition is performed by using multi-SVM (Sup-port Vector Machine) approach which gives an accuracy of 97% when 903 text images are fed to it.