Error Analysis of SaHiT - A Statistical Sanskrit-Hindi Translator
Author(s) - Rajneesh Kumar Pandey, Girish Nath Jha
Publication year - 2016
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2016.08.114
Subject(s) - Sanskrit, Hindi, computer science, natural language processing, artificial intelligence, machine translation, linguistics, philosophy
The paper presents a statistical Sanskrit-Hindi translator and analyzes the errors the system generates. The system is being trained simultaneously on the Microsoft Translator Hub (MTHub) platform and is intended only for simple Sanskrit prose texts. The training set includes 24K parallel sentences and 25K of monolingual data, with recent BLEU (Bilingual Evaluation Understudy) scores of 41 and above. The paper discusses the error analysis of the system and suggests possible solutions. It also covers the evaluation of the MTHub system with the BLEU metric. For developing the MT systems, parallel Sanskrit-Hindi text corpora have been collected or developed manually from the literature, health, news and tourism domains. The paper also discusses issues and challenges in the development of translation systems for languages like Sanskrit.
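For readers unfamiliar with the BLEU scores quoted above, the following is a minimal sketch of how a corpus-level BLEU score is commonly computed: clipped n-gram precisions for n = 1 to 4 are combined by a geometric mean and multiplied by a brevity penalty. It is not the MTHub implementation, whose tokenization and scoring details are not described here. The function names, the single-reference setup, the 0-100 scale and the toy token lists are all assumptions made for illustration and are not taken from the paper's corpus.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(candidates, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, with one reference per candidate.

    Hypothetical helper for illustration; real toolkits add tokenization
    rules, smoothing and multi-reference support.
    """
    matches = [0] * max_n   # clipped n-gram matches, per order n
    totals = [0] * max_n    # candidate n-grams seen, per order n
    cand_len = ref_len = 0
    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            cand_counts = ngrams(cand, n)
            ref_counts = ngrams(ref, n)
            # Clipping: an n-gram is credited at most as many times
            # as it occurs in the reference translation.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
            totals[n - 1] += sum(cand_counts.values())
    # Without smoothing, any order with zero matches gives a score of 0.
    if cand_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output shorter than the reference.
    brevity_penalty = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


# Toy, invented token lists (not from the paper's data).
system_output = [["the", "cat", "sat", "on", "the", "mat"]]
reference = [["the", "cat", "sat", "on", "a", "mat"]]
score = corpus_bleu(system_output, reference)
```

On this scale, an output identical to its reference scores 100 and an output sharing no words with it scores 0, so a score of 41 sits between those extremes.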