Open Access
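The Effect of Increasing the Size of the Parallel Corpus on Translating Indonesian Sentences into the Api Dialect of Lampung (original title: Efek Peningkatan Jumlah Paralel Korpus Pada Penerjemahan Kalimat Bahasa Indonesia ke Bahasa Lampung Dialek Api)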
Author(s) - Permata Permata, Zaenal Abidin, Farida Ariyani
Publication year - 2020
Publication title - Jurnal Komputasi
Language(s) - English
Resource type - Journals
eISSN - 2541-0350
pISSN - 2541-0296
DOI - 10.23960/komputasi.v8i2.2613
Subject(s) - computer science, Indonesian, natural language processing, vocabulary, artificial intelligence, preprocessor, speech recognition, linguistics, philosophy
This study examines experimentally how the size of the parallel corpus affects translation from Indonesian into the Api dialect of Lampung using statistical machine translation (SMT). SMT uses a parallel corpus of Indonesian sentences and their Lampung Api translations as training data. Three strategies were compared: a parallel corpus of 1000 sentences, one of 2000 sentences, and one of 3000 sentences. The work proceeds in four phases: preprocessing; training, in which the parallel corpus is processed to obtain a language model and a translation model; testing; and evaluation. Testing uses four sets: 25 single sentences without out-of-vocabulary (OOV) words, 25 single sentences with OOV words, 25 compound sentences without OOV words, and 25 compound sentences with OOV words. Translation accuracy is reported as the Bilingual Evaluation Understudy (BLEU) score. For the first, second, and third strategies respectively, the scores are 21.49%, 59.58%, and 73.21% for single sentences without OOV; 23.22%, 44.33%, and 68.72% for single sentences with OOV; 18.22%, 39.4%, and 69.18% for compound sentences without OOV; and 25.94%, 28.22%, and 71.94% for compound sentences with OOV.
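
The abstract reports BLEU scores but does not say which tool or settings produced them. As a rough illustration of what the metric measures, the Python sketch below computes a corpus-level BLEU score from clipped n-gram precisions and a brevity penalty. The helper names `ngrams` and `corpus_bleu` are hypothetical, and whitespace tokenization, a single reference per sentence, n-grams up to order 4, and the absence of smoothing are all assumptions rather than details taken from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0..100 scale, one reference per hypothesis.

    Assumptions (not from the paper): whitespace tokenization and no
    smoothing, so the score is 0 if any n-gram order has no match.
    """
    matched = [0] * max_n  # clipped n-gram matches, per order
    total = [0] * max_n    # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            total[n - 1] += sum(h_counts.values())

    if hyp_len == 0 or min(matched) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Penalize output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)
```

Applied separately to each of the four 25-sentence test sets, once per training strategy, a function of this kind would yield a grid of twelve scores like the one reported above, though the exact values depend on the tokenization and smoothing choices.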
