An empirical evaluation of phrase-based statistical machine translation for Indonesia slang-word translator

Open Access

Author(s) - Kyrie Cettyara Eleison, Sari Uli Inggrid Hutahaean, Sarah Christine Tampubolon, Teamsar Muliadi Panggabean, Ike Fitriyaningsih

Publication year - 2022

Publication title - Indonesian Journal of Electrical Engineering and Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.241

H-Index - 17

eISSN - 2502-4760

pISSN - 2502-4752

DOI - 10.11591/ijeecs.v25.i3.pp1803-1813

Subject(s) - computer science, machine translation, slang, phrase, natural language processing, artificial intelligence, word (group theory), benchmark (surveying), span (engineering), machine translation software usability, example based machine translation, linguistics, engineering, philosophy, civil engineering, geodesy, geography

The use of slang (non-standard language) is increasing, especially on social media. This reduces mutual understanding in communication, because not everyone understands slang. The purpose of this work is to develop a slang-word translator. A second objective is to find the minimum number of sentences and the Bilingual Evaluation Understudy (BLEU) score that can serve as a benchmark for deciding that a translation is understandable. The project uses phrase-based statistical machine translation (PBSMT), an approach suited to low-resource languages, with a dataset of 100,000 sentences taken from the comment sections of several online political news portals. The comments were manually translated to produce a parallel corpus pairing non-standard language with standard language. Sample sentences were then drawn from the dataset and distributed in questionnaires to measure how well humans understand the translation results. The implementation achieves a BLEU score of 64, and the minimum number of sentences needed for an understandable machine translation is 500. The questionnaire responses indicate that humans can understand the sentences produced by the machine translation system.
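To make the BLEU figure above more concrete, the following is a minimal sketch of corpus-level BLEU on a 0 to 100 scale, assuming whitespace tokenization, a single reference per sentence, n-grams up to length 4, and no smoothing. The abstract does not say which BLEU implementation or settings the authors used, so the function name `corpus_bleu` and the example sentences are hypothetical illustrations and are not taken from the paper's dataset.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), one reference per hypothesis, no smoothing.

    Hypothetical helper for illustration; not the evaluation code of the paper.
    """
    matched = [0] * max_n   # clipped n-gram matches, per n-gram order
    total = [0] * max_n     # n-grams produced by the system, per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Modified precision: each n-gram is credited at most as often
            # as it occurs in the reference.
            matched[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            total[n - 1] += sum(h_counts.values())

    # Without smoothing, any order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matched) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty: penalize output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Made-up standard-Indonesian reference and system outputs.
    reference = ["saya tidak tahu kenapa dia pergi"]
    print(corpus_bleu(["saya tidak tahu kenapa dia pergi"], reference))
    print(corpus_bleu(["saya tidak tahu kenapa dia pergi sekarang"], reference))
```

The score rises as the system output shares more and longer word sequences with the manually translated reference, which is why a parallel corpus is needed for the evaluation.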
