Generative imaging and image processing via generative encoder
Author(s) - Yong Zheng Ong, Haizhao Yang
Publication year - 2022
Publication title - Inverse Problems and Imaging
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.755
H-Index - 40
eISSN - 1930-8345
pISSN - 1930-8337
DOI - 10.3934/ipi.2021060
Subject(s) - inpainting, generative grammar, generative model, mathematics, encoder, algorithm, image (mathematics), artificial intelligence, deblurring, computer science, image processing, statistics, image restoration
This paper introduces a novel generative encoder (GE) framework for generative imaging and image processing tasks such as image reconstruction, compression, denoising, inpainting, deblurring, and super-resolution. GE unifies the generative capacity of GANs and the stability of AEs in an optimization framework, instead of stacking GANs and AEs into a single network or combining their loss functions as in the existing literature. GE also provides a novel approach to visualizing relationships between latent spaces and the data space. The GE framework consists of a pre-training phase and a solving phase. In the former, a GAN with generator $G$ capturing the data distribution of a given image set, and an AE network with encoder $E$ that compresses images following the distribution estimated by $G$, are trained separately, resulting in two latent representations of the data, denoted as the generative and encoding latent spaces respectively. In the solving phase, given a noisy image $x = \mathcal{P}(x^*)$, where $x^*$ is the target unknown image and $\mathcal{P}$ is an operator applying additive, multiplicative, or convolutional noise, or equivalently given such an image $x$ in the compressed domain, i.e., given $m = E(x)$, the two latent spaces are unified by solving the optimization problem
$$ z^* = \underset{z}{\mathrm{argmin}} \, \|E(G(z))-m\|_2^2+\lambda\|z\|_2^2, $$
where $\lambda>0$ is a hyperparameter, and the image $x^*$ is recovered in a generative way via $\hat{x} := G(z^*) \approx x^*$.
The unification of the two spaces yields improved performance over the corresponding GAN and AE networks, while revealing interesting properties of each latent space.
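The solving phase is a regularized least-squares problem in the latent variable $z$. As a rough, self-contained illustration (not the paper's implementation), the sketch below replaces the pre-trained generator $G$ and encoder $E$ with toy linear maps and minimizes the objective $\|E(G(z))-m\|_2^2 + \lambda\|z\|_2^2$ by gradient descent; all dimensions and names here are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear stand-ins for the pre-trained networks (assumptions for
# illustration only; the paper uses a trained GAN generator G and AE encoder E):
A = rng.standard_normal((8, 4))   # "generator" G(z) = A @ z  (latent dim 4 -> image dim 8)
B = rng.standard_normal((4, 8))   # "encoder"   E(x) = B @ x  (image dim 8 -> code dim 4)

def G(z):
    return A @ z

def E(x):
    return B @ x

# Observed compressed measurement m = E(x) of some unknown image x
z_true = rng.standard_normal(4)
m = E(G(z_true))

lam = 1e-3                        # hyperparameter lambda > 0
M = B @ A                         # composite linear map z -> E(G(z))
# Safe step size: inverse of the largest eigenvalue of the Hessian 2 M^T M + 2 lam I
lr = 1.0 / np.linalg.eigvalsh(2 * M.T @ M + 2 * lam * np.eye(4)).max()

# Gradient descent on  ||E(G(z)) - m||_2^2 + lam * ||z||_2^2
z = np.zeros(4)
for _ in range(5000):
    grad = 2 * M.T @ (M @ z - m) + 2 * lam * z
    z -= lr * grad

x_hat = G(z)                      # generative recovery  x_hat = G(z*)
print(np.linalg.norm(E(x_hat) - m))   # residual in the compressed domain
```

In the paper's setting $G$ and $E$ are nonlinear networks, so $z^*$ has no closed form and is found by a gradient-based optimizer on the same objective.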
