CPGAN : An Efficient Architecture Designing for Text-to-Image Generative Adversarial Networks Based on Canonical Polyadic Decomposition
Author(s) - Ruixin Ma, Junying Lou
Publication year - 2021
Publication title -
Scientific Programming
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.269
H-Index - 36
eISSN - 1875-919X
pISSN - 1058-9244
DOI - 10.1155/2021/5573751
Subject(s) - computer science , image (mathematics) , autoencoder , architecture , decomposition , generative grammar , process (computing) , quality (philosophy) , artificial intelligence , stability (learning theory) , network architecture , image quality , theoretical computer science , deep learning , machine learning , programming language , computer network , art , ecology , philosophy , epistemology , visual arts , biology
Text-to-image synthesis is an important and challenging application of computer vision, and many interesting and meaningful text-to-image synthesis models have been proposed. However, most works focus on the quality of the synthesized images and rarely consider model size. Large models contain many parameters and incur high latency, which makes them difficult to deploy in mobile applications. To solve this problem, we propose CPGAN, an efficient architecture for text-to-image generative adversarial networks (GANs) based on canonical polyadic decomposition (CPD). It is a general method for designing lightweight text-to-image GAN architectures. To improve the stability of CPGAN, we introduce conditioning augmentation and the idea of the autoencoder during training. Experimental results show that CPGAN maintains the quality of the generated images while reducing parameters and FLOPs by at least 20%.
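The parameter savings that CPD offers can be sketched with a simple count. This is a minimal illustration of the general technique, not the paper's actual architecture: the layer sizes and rank below are hypothetical. A full convolution kernel of shape (C_out, C_in, k, k) is replaced by rank-R CP factors, one factor matrix per tensor mode.

```python
# Hypothetical sketch of CP-decomposition parameter savings for one conv layer.
# A full Conv2d kernel (C_out, C_in, k, k) stores C_out * C_in * k * k weights.
# A rank-R CP decomposition stores four factor matrices instead:
# (C_out x R), (C_in x R), (k x R), (k x R).

def conv_params(c_out: int, c_in: int, k: int) -> int:
    """Parameter count of a full convolution kernel."""
    return c_out * c_in * k * k

def cp_conv_params(c_out: int, c_in: int, k: int, rank: int) -> int:
    """Parameter count of the rank-R CP factorization of the same kernel."""
    return rank * (c_out + c_in + k + k)

# Example layer sizes (hypothetical, chosen for illustration only).
full = conv_params(256, 128, 3)          # 256 * 128 * 9  = 294,912
cp = cp_conv_params(256, 128, 3, 64)     # 64 * (256 + 128 + 3 + 3) = 24,960
reduction = 1 - cp / full
print(f"full={full}, cp={cp}, reduction={reduction:.1%}")
```

The rank R controls the trade-off: smaller ranks compress more but may hurt image quality, which is why the stabilizing tricks (conditioning augmentation, the autoencoder idea) matter during training.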