Problem Statement
Given a monocular video as input, how can we recover a 3D human mesh that is spatially consistent in the world coordinate system?