Why is Sparse-View 4D Reconstruction Hard?

Methodologies

Overview

As dynamic scene reconstruction from sparse views is extremely challenging, we present two key insights to initialize plausible geometry and motion:

Initializing consistent scene geometry via confidence-aware spatio-temporal alignment
Initializing motion trajectories by clustering per-point 3D semantic features distilled from 2D foundation models

Space-time consistent depth

Feature-based motion bases: