Open Access
UpFusion: Novel View Diffusion from Unposed Sparse View Observations
Author(s)
Bharath Raj Nagoor Kani,
Hsin-Ying Lee,
Sergey Tulyakov,
Shubham Tulsiani
Publication year: 2024
Abstract
We propose UpFusion, a system that can perform novel view synthesis and infer 3D representations for an object given a sparse set of reference images without corresponding pose information. Current sparse-view 3D inference methods typically rely on camera poses to geometrically aggregate information from input views, but are not robust in-the-wild when such information is unavailable/inaccurate. In contrast, UpFusion sidesteps this requirement by learning to implicitly leverage the available images as context in a conditional generative model for synthesizing novel views. We incorporate two complementary forms of conditioning into diffusion models for leveraging the input views: a) via inferring query-view aligned features using a scene-level transformer, b) via intermediate attentional layers that can directly observe the input image tokens. We show that this mechanism allows generating high-fidelity novel views while improving the synthesis quality given additional (unposed) images. We evaluate our approach on the Co3Dv2 and Google Scanned Objects datasets and demonstrate the benefits of our method over pose-reliant sparse-view methods as well as single-view methods that cannot leverage additional views. Finally, we also show that our learned model can generalize beyond the training categories and even allow reconstruction from self-captured images of generic objects in-the-wild.
Language(s): English
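
The abstract describes two complementary conditioning pathways for the diffusion model: (a) query-view aligned features inferred by a scene-level transformer, and (b) attention layers that directly observe the unposed input-view tokens. The PyTorch sketch below illustrates how such a design could be wired together. It is an assumption-laden illustration, not the authors' implementation: SceneLevelTransformer, ConditionedDiffusionBlock, the residual injection of aligned features, and all shapes and hyperparameters are hypothetical.

import torch
import torch.nn as nn

class SceneLevelTransformer(nn.Module):
    """Hypothetical stand-in: jointly processes unposed input-view tokens
    with query-view embeddings to produce query-view aligned features."""
    def __init__(self, dim=256, depth=4, heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)

    def forward(self, input_tokens, query_tokens):
        # Concatenate query and context tokens, encode jointly, and keep
        # only the query-aligned slice as conditioning features.
        n_query = query_tokens.shape[1]
        fused = self.encoder(torch.cat([query_tokens, input_tokens], dim=1))
        return fused[:, :n_query]  # (B, n_query, dim)

class ConditionedDiffusionBlock(nn.Module):
    """One denoiser block combining both conditioning forms:
    (a) query-view aligned features injected into the latent stream,
    (b) cross-attention directly over the input image tokens."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj_aligned = nn.Linear(dim, dim)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x, aligned_feats, input_tokens):
        # (a) inject query-view aligned features (residual addition here;
        #     channel concatenation would be another plausible choice).
        x = x + self.proj_aligned(aligned_feats)
        h = self.norm1(x)
        x = x + self.self_attn(h, h, h)[0]
        # (b) attend directly over the (unposed) input image tokens.
        h = self.norm2(x)
        x = x + self.cross_attn(h, input_tokens, input_tokens)[0]
        return x

# Toy usage: 3 unposed reference views, each tokenized into 64 tokens.
B, dim = 2, 256
input_tokens = torch.randn(B, 3 * 64, dim)  # tokens from all reference views
query_tokens = torch.randn(B, 64, dim)      # embeddings for the query view
noisy_latent = torch.randn(B, 64, dim)      # diffusion latent being denoised

aligned = SceneLevelTransformer(dim)(input_tokens, query_tokens)
out = ConditionedDiffusionBlock(dim)(noisy_latent, aligned, input_tokens)
print(out.shape)  # torch.Size([2, 64, 256])

Because pathway (b) attends over a variable-length token sequence, appending tokens from additional unposed views requires no architectural change, which is consistent with the abstract's claim that synthesis quality improves as more images are provided.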
