
Multimodal supervised image translation
Author(s) -
Ruan Congcong,
Chen Dihu,
Hu Haifeng
Publication year - 2019
Publication title -
Electronics Letters
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.375
H-Index - 146
eISSN - 1350-911X
pISSN - 0013-5194
DOI - 10.1049/el.2018.6167
Subject(s) - computer science , artificial intelligence , machine learning , computer vision , computer graphics , pattern recognition , image translation , probabilistic logic , gaussian
Multimodal image‐to‐image translation is a class of vision and graphics problems whose goal is to learn a one‐to‐many mapping between a source domain and a target domain: given an image in the source domain, the model should produce results that are as diverse as possible. This is an important and challenging aspect of image translation. Recent works draw latent vectors from a Gaussian distribution to obtain diverse results, but the differences between outputs remain small because of the probabilistic nature of the Gaussian distribution, whose samples concentrate around the mean. In this work, the authors propose linearly distributed latent codes, rather than conventional Gaussian vectors, to control the style of the generated images. By exploiting the linear distribution, their model produces much more diverse results and outperforms state‐of‐the‐art baselines in terms of diversity. Qualitative and quantitative comparisons against the baselines demonstrate the effectiveness and superiority of the method.
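The core idea of the abstract can be illustrated with a minimal sketch. The snippet below is a hypothetical construction, not the authors' implementation: it contrasts latent codes sampled from a standard Gaussian (which can land arbitrarily close together) with codes placed at evenly spaced points along a line segment in latent space (one plausible reading of "linearly distributed"), and compares the minimum pairwise distance between codes as a crude proxy for style diversity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim = 8, 4  # number of latent codes and their dimensionality (illustrative)

# Conventional approach: latent codes drawn i.i.d. from a standard Gaussian.
# Samples concentrate around the origin, so two codes can be nearly identical.
gaussian_codes = rng.standard_normal((n, dim))

# Hypothetical "linearly distributed" codes: evenly spaced along a line
# segment between two endpoints, so adjacent codes keep a fixed spacing.
endpoint_a, endpoint_b = -np.ones(dim), np.ones(dim)
t = np.linspace(0.0, 1.0, n)[:, None]            # n evenly spaced steps in [0, 1]
linear_codes = endpoint_a + t * (endpoint_b - endpoint_a)

def min_pairwise_dist(codes):
    """Smallest Euclidean distance between any two distinct codes."""
    d = np.linalg.norm(codes[:, None, :] - codes[None, :, :], axis=-1)
    return d[~np.eye(len(codes), dtype=bool)].min()

print(min_pairwise_dist(gaussian_codes))  # can be arbitrarily small
print(min_pairwise_dist(linear_codes))    # fixed, guaranteed spacing
```

Each code would then be fed to the generator as the style input; the guaranteed spacing between linearly placed codes is what, on this reading, pushes the generated styles apart.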