z-logo
open-access-imgOpen Access
Do Not Have Enough Data? Deep Learning to the Rescue!
Author(s) -
Ateret Anaby-Tavor,
Boaz Carmeli,
Esther Goldbraich,
Amir Kantor,
George Kour,
Segev Shlomov,
Naama Tepper,
Naama Zwerdling
Publication year - 2020
Publication title -
proceedings of the aaai conference on artificial intelligence
Language(s) - English
Resource type - Journals
eISSN - 2374-3468
pISSN - 2159-5399
DOI - 10.1609/aaai.v34i05.6233
Subject(s) - computer science , artificial intelligence , classifier (uml) , machine learning , labeled data , task (project management) , artificial neural network , training set , language model , class (philosophy) , deep learning , natural language processing , management , economics
Based on recent advances in natural language modeling and those in text generation capabilities, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially synthesize new labeled data for supervised learning. We mainly focus on cases with scarce labeled data. Our method, referred to as language-model-based data augmentation (LAMBADA), involves fine-tuning a state-of-the-art language generator to a specific task through an initial training phase on the existing (usually small) labeled data. Using the fine-tuned model and given a class label, new sentences for the class are generated. Our process then filters these new sentences by using a classifier trained on the original data. In a series of experiments, we show that LAMBADA improves classifiers' performance on a variety of datasets. Moreover, LAMBADA significantly improves upon the state-of-the-art techniques for data augmentation, specifically those applicable to text classification tasks with little data.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom