Generative imaging and image processing via generative encoder
Author(s) -
Yong Zheng Ong,
Haizhao Yang
Publication year - 2021
Publication title -
Inverse Problems and Imaging
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.755
H-Index - 40
eISSN - 1930-8345
pISSN - 1930-8337
DOI - 10.3934/ipi.2021060
Subject(s) - inpainting , generative grammar , encoder , mathematics , generative model , image (mathematics) , algorithm , computer science , artificial intelligence , statistics
This paper introduces a novel generative encoder (GE) framework for generative imaging and image processing tasks such as image reconstruction, compression, denoising, inpainting, deblurring, and super-resolution. GE unifies the generative capacity of GANs and the stability of AEs in an optimization framework, instead of stacking GANs and AEs into a single network or combining their loss functions as in the existing literature. GE also provides a novel approach to visualizing relationships between latent spaces and the data space. The GE framework consists of a pre-training phase and a solving phase. In the former, a GAN with generator $G$, which captures the data distribution of a given image set, and an AE network with encoder $E$, which compresses images following the distribution estimated by $G$, are trained separately, resulting in two latent representations of the data, denoted as the generative and encoding latent spaces, respectively. In the solving phase, given a noisy image $x = \mathcal{P}(x^*)$, where $x^*$ is the unknown target image and $\mathcal{P}$ is an operator adding additive, multiplicative, or convolutional noise, or equivalently given such an image $x$ in the compressed domain, i.e., given $m = E(x)$, the two latent spaces are unified by solving the optimization problem

$$ z^* = \underset{z}{\mathrm{argmin}} \|E(G(z))-m\|_2^2+\lambda\|z\|_2^2 $$

and the image $x^*$ is recovered in a generative way via $\hat{x} := G(z^*) \approx x^*$, where $\lambda>0$ is a hyperparameter.
The unification of the two spaces allows improved performance against corresponding GAN and AE networks while visualizing interesting properties in each latent space.
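The solving phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the pre-trained networks $G$ and $E$ are replaced here by hypothetical linear maps (the matrices `A` and `B`), and plain gradient descent is assumed as the solver, since the abstract does not state which optimizer is used. The names `solve_latent`, `objective`, and all dimensions are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, image_dim, code_dim = 4, 16, 8

# Hypothetical stand-ins for the pre-trained networks (assumption: linear maps).
A = rng.standard_normal((image_dim, latent_dim)) / np.sqrt(image_dim)  # "generator" weights
B = rng.standard_normal((code_dim, image_dim)) / np.sqrt(code_dim)     # "encoder" weights


def G(z):
    """Toy generator: latent vector -> image vector."""
    return A @ z


def E(x):
    """Toy encoder: image vector -> compressed code."""
    return B @ x


def objective(z, m, lam):
    """||E(G(z)) - m||_2^2 + lam * ||z||_2^2, the objective from the abstract."""
    residual = E(G(z)) - m
    return residual @ residual + lam * (z @ z)


def solve_latent(m, lam=0.1, steps=5000):
    """Minimize the objective over z by gradient descent (assumed solver)."""
    M = B @ A  # Jacobian of E(G(z)); constant because both maps are linear
    # Step size 1/L, where L bounds the largest eigenvalue of the Hessian 2(M^T M + lam I).
    step = 1.0 / (2.0 * (np.linalg.norm(M, 2) ** 2 + lam))
    z = np.zeros(latent_dim)
    for _ in range(steps):
        grad = 2.0 * M.T @ (E(G(z)) - m) + 2.0 * lam * z
        z = z - step * grad
    return z


# Noisy observation x = P(x*), here with additive Gaussian noise.
x_true = G(rng.standard_normal(latent_dim))
x_noisy = x_true + 0.01 * rng.standard_normal(image_dim)

lam = 0.1
m = E(x_noisy)                 # image in the compressed domain
z_star = solve_latent(m, lam)  # solution of the optimization problem
x_hat = G(z_star)              # generative recovery of the image
```

With real networks, the gradient would come from automatic differentiation through $E \circ G$ rather than from the closed-form expression used here, and the problem would generally be non-convex, so the recovered $z^*$ may depend on the initialization.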