A Chunk-based n-gram English to Thai Transliteration
Author(s) -
Wirote Aroonmanakun
Publication year - 1970
Publication title -
ECTI Transactions on Computer and Information Technology (ECTI-CIT)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.132
H-Index - 2
ISSN - 2286-9131
DOI - 10.37936/ecti-cit.200622.53286
Subject(s) - transliteration, computer science, pronunciation, grapheme, n-gram, artificial intelligence, natural language processing, speech recognition, language model, linguistics, philosophy, graphene, physics, quantum mechanics
This study proposes a chunk-based n-gram model for English-to-Thai transliteration. The model is compared with three other models: a table lookup model, a decision tree model, and a statistical model. The chunk-based n-gram model achieves 67% word accuracy, which is higher than the accuracy of the other models. The performance of all models improves slightly when an English grapheme-to-phoneme module is included in the system. However, the accuracy of the system is not sufficient for a public transliteration tool. The low accuracy is caused by the poor performance of the English grapheme-to-phoneme module and by inconsistent pronunciation in the training data. Some suggestions are provided for further improvement.
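The word accuracy figure reported above can be illustrated with a minimal sketch. It assumes that word accuracy means the fraction of test words whose predicted Thai transliteration matches the reference exactly; the abstract does not spell out the definition, so this is an assumption rather than the author's evaluation procedure. The function name `word_accuracy` and the toy word lists are hypothetical and are not taken from the paper's data.

```python
def word_accuracy(predictions, references):
    """Fraction of words whose predicted transliteration equals the reference.

    Assumption: exact string match on the whole word counts as correct.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(1 for p, r in zip(predictions, references) if p == r)
    return correct / len(references)


# Hypothetical toy data (not from the paper): the second prediction differs
# from its reference by one consonant, so it counts as an error.
references = ["คอมพิวเตอร์", "อินเทอร์เน็ต", "ซอฟต์แวร์"]
predictions = ["คอมพิวเตอร์", "อินเตอร์เน็ต", "ซอฟต์แวร์"]

print(f"word accuracy: {word_accuracy(predictions, references):.0%}")
```

Under this definition a single wrong character makes the whole word incorrect, which is one reason word accuracy tends to be a strict measure for transliteration systems.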