Sentence‐Chain Based Seq2seq Model for Corpus Expansion

Open Access

Author(s) - Chung Euisok, Park Jeon Gue

Publication year - 2017

Publication title - ETRI Journal

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.295

H-Index - 46

eISSN - 2233-7326

pISSN - 1225-6463

DOI - 10.4218/etrij.17.0116.0074

Subject(s) - perplexity, computer science, sentence, language model, natural language processing, artificial intelligence, recurrent neural network, encoder, artificial neural network, sequence (biology), speech recognition, biology, genetics, operating system

This study focuses on a method for sequential data augmentation to alleviate data sparseness problems. Specifically, we present corpus expansion techniques for enhancing the coverage of a language model. Recent recurrent neural network studies show that a seq2seq model can be applied to language generation: it can generate new sentences from given input sentences. We present a method of corpus expansion using a sentence‐chain based seq2seq model. To train the seq2seq model, sentence chains are used as triples. The first two sentences in a triple are fed to the encoder of the seq2seq model, while the last sentence becomes the target sequence for the decoder. Using only internal resources, evaluation results show a relative perplexity improvement of approximately 7.6% over a baseline language model for Korean text. Additionally, compared with a previous study, the sentence‐chain approach reduces the size of the training data by 38.4% while generating 1.4 times the number of n‐grams, with superior performance on English text.
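
The triple construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a sentence chain is simply a run of three consecutive sentences from one document and that the two encoder sentences are joined with a separator token; the abstract specifies neither detail. The names `make_triples`, `to_seq2seq_pair`, and `SEP` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical separator between the two encoder sentences (an assumption;
# the abstract does not say how the two sentences are combined).
SEP = "<sep>"


def make_triples(sentences: List[str]) -> List[Tuple[str, str, str]]:
    """Slide a window of three over consecutive sentences.

    Assumes a "sentence chain" is three adjacent sentences; the paper may
    select chains differently.
    """
    return [
        (sentences[i], sentences[i + 1], sentences[i + 2])
        for i in range(len(sentences) - 2)
    ]


def to_seq2seq_pair(triple: Tuple[str, str, str]) -> Tuple[str, str]:
    """Map a triple to (encoder input, decoder target).

    The first two sentences form the encoder input; the third sentence is
    the target sequence for the decoder.
    """
    first, second, third = triple
    return f"{first} {SEP} {second}", third


if __name__ == "__main__":
    document = [
        "I went to the market.",
        "I bought some apples.",
        "Then I went home.",
        "I baked a pie.",
    ]
    for source, target in map(to_seq2seq_pair, make_triples(document)):
        print(f"encoder: {source!r} -> decoder target: {target!r}")
```

Under these assumptions, a document with n sentences yields n - 2 training pairs. A trained model would then be given new sentence pairs to generate additional sentences for the expanded corpus.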
