Related Work

1. Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video.

    Key Insight: Human mesh recovery works best when 3D keypoints (pose) and 3D shape (mesh) are used jointly, rather than relying on either one alone.

    • Decoupling and co-evolution of pose estimation and mesh prediction: poses provide information about human motion, while meshes provide information about body shape.

2. World-Grounded Human Motion Recovery via Gravity-View Coordinates.

    Key Insight: Use a Gravity-View (GV) coordinate system to infer per-frame human motion, enabling robust world-grounded HMR from video.

    • The GV coordinate system is aligned with the gravity direction and the camera view direction.
    • The GV coordinate system eliminates inconsistencies between the coordinate systems of different frames.
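
One way to picture a gravity-and-view-aligned frame is to build its axes from a gravity direction and a camera view direction, both expressed in camera coordinates. The sketch below is an illustration under assumptions, not the paper's implementation: the function name `gravity_view_frame`, the axis convention (y along gravity, x perpendicular to gravity and view, z completing a right-handed frame), and the example vectors are all assumed here.

```python
import math


def normalize(v):
    """Return v scaled to unit length; raise if v is (near) zero."""
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / n for c in v)


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def gravity_view_frame(gravity, view):
    """Build an orthonormal frame from gravity and view directions.

    Assumed convention (hypothetical, for illustration only):
      y = unit gravity direction
      x = unit vector perpendicular to both gravity and view
      z = x cross y, which completes a right-handed frame
    Raises ValueError if the view direction is parallel to gravity.
    """
    y = normalize(gravity)
    x = normalize(cross(y, view))
    z = cross(x, y)
    return x, y, z


# Assumed example: gravity along +y in camera coordinates.
gravity = (0.0, 1.0, 0.0)
level_view = (0.0, 0.0, 1.0)    # camera looking straight ahead
pitched_view = (0.0, 0.5, 1.0)  # same heading, camera pitched

frame_level = gravity_view_frame(gravity, level_view)
frame_pitched = gravity_view_frame(gravity, pitched_view)
```

In this sketch, pitching the camera without changing its heading yields the same frame, because only the component of the view direction perpendicular to gravity affects the x and z axes. That is one intuition for why a frame tied to gravity and view direction is well defined for each frame of a video.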

3. CameraHMR: Aligning People with Perspective.

    Key Insight: Integrating the predicted camera field of view (FoV) into the reconstruction pipeline improves HMR on monocular images with severe perspective distortion.

    • HumanFoV: Predicts the FoV directly from the input image.
    • CamSMPLify: Incorporates the predicted FoV into a full perspective camera model, replacing the traditional weak-perspective assumption.
    • CameraHMR: Improves the original HMR2.0 architecture by integrating the camera intrinsics predicted by HumanFoV.
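
To make the role of the FoV concrete, the sketch below converts a field of view into a pinhole focal length and contrasts full-perspective projection with a weak-perspective approximation. It is a minimal illustration using the standard pinhole relation f = (size / 2) / tan(FoV / 2), not code from CameraHMR. The function names, the choice to measure the FoV along a single image dimension, and the sample numbers are assumptions.

```python
import math


def focal_from_fov(fov_deg, image_size_px):
    """Pinhole focal length in pixels for a FoV measured along one image dimension."""
    return 0.5 * image_size_px / math.tan(math.radians(fov_deg) / 2.0)


def project_perspective(point, focal, cx, cy):
    """Full perspective: each point is divided by its own depth."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal * x / z + cx, focal * y / z + cy)


def project_weak_perspective(point, focal, cx, cy, z_ref):
    """Weak perspective: every point shares one reference depth z_ref."""
    x, y, _ = point
    return (focal * x / z_ref + cx, focal * y / z_ref + cy)


# Assumed example: 1000-pixel image, 90-degree FoV, principal point at the center.
focal = focal_from_fov(90.0, 1000)
cx = cy = 500.0

near_point = (1.0, 0.0, 2.0)  # e.g. a hand reaching toward the camera
z_ref = 4.0                   # assumed depth of the body center

uv_full = project_perspective(near_point, focal, cx, cy)
uv_weak = project_weak_perspective(near_point, focal, cx, cy, z_ref)
```

With these assumed numbers, the two projections place the same point at noticeably different pixel locations, because the weak-perspective model ignores how far the point is from the reference depth. The gap grows as the subject gets closer to the camera or the FoV widens, which is the regime the bullets above describe as severe perspective distortion.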

    [References]

    1. You, Yingxuan, et al. “Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.
    2. Shen, Zehong, et al. “World-Grounded Human Motion Recovery via Gravity-View Coordinates.” SIGGRAPH Asia 2024 Conference Papers. 2024.
    3. Patel, Priyanka, and Michael J. Black. “CameraHMR: Aligning People with Perspective.” arXiv preprint arXiv:2411.08128 (2024).