
Generative imaging and image processing via generative encoder
Author(s) -
Yong Zheng Ong,
Haizhao Yang
Publication year - 2022
Publication title -
Inverse Problems and Imaging
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.755
H-Index - 40
eISSN - 1930-8345
pISSN - 1930-8337
DOI - 10.3934/ipi.2021060
Subject(s) - inpainting, generative grammar, generative model, mathematics, encoder, algorithm, image (mathematics), artificial intelligence, deblurring, computer science, image processing, statistics, image restoration
This paper introduces a novel generative encoder (GE) framework for generative imaging and image processing tasks such as image reconstruction, compression, denoising, inpainting, deblurring, and super-resolution. GE unifies the generative capacity of GANs and the stability of AEs in an optimization framework, instead of stacking GANs and AEs into a single network or combining their loss functions as in the existing literature. GE also provides a novel approach to visualizing relationships between latent spaces and the data space. The GE framework consists of a pre-training phase and a solving phase. In the former, a GAN with generator $G$ capturing the data distribution of a given image set, and an AE network with encoder $E$ that compresses images following the distribution estimated by $G$, are trained separately, resulting in two latent representations of the data, denoted the generative and encoding latent spaces respectively. In the solving phase, given a noisy image $x = \mathcal{P}(x^*)$, where $x^*$ is the target unknown image and $\mathcal{P}$ is an operator applying additive, multiplicative, or convolutional noise, or, equivalently, given such an image $x$ in the compressed domain, i.e., given $m = E(x)$, the two latent spaces are unified by solving the optimization problem
$$ z^* = \underset{z}{\mathrm{argmin}} \, \|E(G(z)) - m\|_2^2 + \lambda \|z\|_2^2, $$
and the image $x^*$ is recovered in a generative way via $\hat{x} := G(z^*) \approx x^*$, where $\lambda > 0$ is a hyperparameter. The unification of the two spaces yields improved performance over the corresponding GAN and AE networks while revealing interesting properties in each latent space.
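The solving phase is a latent-space search that any gradient-based optimizer can carry out. The sketch below is a minimal PyTorch illustration of that step, assuming a pretrained generator `G` and encoder `E` are available as modules; the function name `ge_solve`, the latent dimension, and the Adam settings are illustrative assumptions, not the authors' released implementation.

```python
import torch

def ge_solve(G, E, x, latent_dim=128, lam=1e-3, steps=1000, lr=1e-2):
    """Recover x_hat ~ x* from a corrupted image x by searching the
    generative latent space so that E(G(z)) matches m = E(x).
    NOTE: hypothetical sketch; G, E, and all hyperparameters are assumed."""
    G.eval(); E.eval()
    with torch.no_grad():
        m = E(x)                  # encoding of the corrupted image
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # data-fit term in the encoding latent space + l2 regularizer on z
        loss = torch.sum((E(G(z)) - m) ** 2) + lam * torch.sum(z ** 2)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return G(z)               # x_hat := G(z*) approximates x*
```

Because the objective is evaluated in the compressed domain $m = E(x)$, the same routine applies unchanged whether the corruption is additive, multiplicative, or convolutional: only the input $x$ changes, not the optimization.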