
Lifting Deep Image Denoisers to Video with Frame Interpolation Pre-training
Author(s) - Piotr Kopa Ostrowski, Daniel Wesierski, Anna Jezierska, Tomasz Stefanski
Publication year - 2025
Publication title - IEEE Transactions on Circuits and Systems for Video Technology
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.873
H-Index - 168
eISSN - 1558-2205
pISSN - 1051-8215
DOI - 10.1109/tcsvt.2025.3575717
Subject(s) - components, circuits, devices and systems; communication, networking and broadcast technologies; computing and processing; signal processing and analysis
We introduce Frame Interpolation Pre-training (FIP), a simple learning technique for lifting deep image denoisers to video denoising with improved implicit temporal alignment. Modern video denoising networks typically rely on explicit motion estimation and alignment, which are computationally intensive and harder to re-design and re-train, restricting their application scope and usability. Conversely, stacking frames and image denoisers without explicit motion estimation modules improves speed and yields a simpler design, which makes image denoisers easier to generalize to the video domain. However, this approach is less accurate because it captures temporal dependencies suboptimally. To better leverage the adjacent frames in this setting and reduce the accuracy gap, we propose a novel training regime that divides the standard supervised training of the denoising task into two phases. In the initial phase, FIP guides the network to interpolate a fully masked central frame using only adjacent noisy input frames. In the subsequent phase, the pre-trained network is fine-tuned on denoising the central frame, now using all noisy input frames. Extensive diagnostics indicate that FIP-based networks provide better implicit motion estimation and temporal alignment. As a result, qualitative and quantitative evaluation on standard video denoising datasets with synthetic and real noise demonstrates that FIP consistently improves the video denoising accuracy of motion-aware, video-lifted image denoisers without additional computational overhead at training or test time. Our code is available at https://github.com/camalab-ai/FIP.
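The two-phase input construction described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function name `make_network_input`, the phase labels `"fip"` and `"denoise"`, and the choice of zero-filling as the "full mask" are all assumptions made for illustration. The abstract does not say how the mask is realized or what loss is used.

```python
from typing import List, Sequence

# One grayscale frame as rows of pixel values (illustrative representation).
Frame = List[List[float]]


def make_network_input(noisy_frames: Sequence[Frame], phase: str) -> List[Frame]:
    """Build the frame stack fed to the network for a given training phase.

    phase "fip":     the central frame is fully masked, so the network can
                     only use the adjacent noisy frames (pre-training).
    phase "denoise": all noisy frames, including the central one, are kept
                     (fine-tuning).
    Zero-filling as the mask is an assumption, not the paper's stated choice.
    """
    if len(noisy_frames) % 2 == 0:
        raise ValueError("an odd number of frames is needed to have a central frame")
    if phase not in ("fip", "denoise"):
        raise ValueError("phase must be 'fip' or 'denoise'")

    center = len(noisy_frames) // 2
    # Copy every row so the caller's frames are never modified in place.
    stacked = [[list(row) for row in frame] for frame in noisy_frames]
    if phase == "fip":
        stacked[center] = [[0.0] * len(row) for row in stacked[center]]
    return stacked


# Toy example: three 2x2 noisy frames; the middle one is the central frame.
frames = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.6], [0.7, 0.8]],
    [[0.9, 1.0], [1.1, 1.2]],
]
fip_input = make_network_input(frames, "fip")          # central frame masked
denoise_input = make_network_input(frames, "denoise")  # all frames visible
```

In this sketch only the network input differs between phases, so the same architecture could be trained in both without extra modules, which is in line with the abstract's claim of no additional computational overhead.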