Key Points Detection

  • Face key points detection
    • Convolutional Pose Machine
  • Ear key points detection
    • YOLO + Convolutional Pose Machine
Left: face key points, right: ear key points
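
A minimal sketch of how key points might be read out of CPM heatmaps and mapped back to full-image pixels. It assumes the ear pipeline is two-stage (YOLO proposes a box, the CPM runs on the crop), which the notes imply but do not state. The name `heatmaps_to_keypoints`, the heatmap layout (K, H, W), and the box format (x0, y0, x1, y1) are hypothetical choices for illustration.

```python
import numpy as np

def heatmaps_to_keypoints(heatmaps, box):
    """Decode one (x, y) location per heatmap channel.

    heatmaps: (K, H, W) array, one channel per key point (assumed layout).
    box:      (x0, y0, x1, y1) crop in full-image pixels (assumed format),
              e.g. a detector box for the ear; use the whole frame for the face.
    Returns a (K, 2) array of (x, y) in full-image coordinates.
    """
    k, h, w = heatmaps.shape
    x0, y0, x1, y1 = box
    # Index of the strongest response in each channel.
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    # Scale heatmap cell centres to crop pixels, then shift by the box corner.
    sx = (x1 - x0) / w
    sy = (y1 - y0) / h
    return np.stack([x0 + (xs + 0.5) * sx, y0 + (ys + 0.5) * sy], axis=1)
```

Taking the argmax is the simplest decoding; sub-pixel refinement is a common alternative and is not shown here.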

3DMM Fitting


  • 2D keypoints K1 (including ears) on RGB video from OpenPose and CPM; ground-truth (GT) depth d1
  • 3DMM (learned) + R, T (learned) + camera intrinsics -> 2D keypoints K2, depth d2
  • Loss between K1 and K2, and between d1 and d2
  • Depth loss
    • Fix the correspondence between mesh vertices and the pixel in the video
    • Iterate until the depth loss converges under the fixed correspondence, then update the correspondence -> increases the stability of the model
Blue key points: video, red key points: mesh
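
The fitting loop above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a pinhole camera with a 3x3 intrinsics matrix, squared-error losses, a nearest-pixel correspondence, and a caller-supplied optimiser step. All function names, the weight `w_depth`, and the convergence tolerance are hypothetical.

```python
import numpy as np

def project(vertices, R, T, K):
    """Pinhole projection: (N, 3) mesh vertices -> (N, 2) pixels and (N,) depth."""
    cam = vertices @ R.T + T          # rigid transform into the camera frame
    depth = cam[:, 2]
    uv = cam @ K.T                    # apply intrinsics
    return uv[:, :2] / depth[:, None], depth

def fitting_loss(vertices, kp_idx, R, T, K, k1, depth_map, corr, w_depth=1.0):
    """Keypoint loss (K1 vs K2) plus depth loss (d1 vs d2).

    kp_idx:    indices of the mesh vertices that act as key points
    k1:        (M, 2) detected key points in pixels
    depth_map: (H, W) ground-truth depth
    corr:      (N, 2) integer (row, col) pixel per vertex, held fixed here
    """
    k2, d2 = project(vertices, R, T, K)
    kp_loss = np.mean(np.sum((k2[kp_idx] - k1) ** 2, axis=1))
    d1 = depth_map[corr[:, 0], corr[:, 1]]   # GT depth at the fixed pixels
    return kp_loss + w_depth * np.mean((d2 - d1) ** 2)

def update_correspondence(vertices, R, T, K, shape):
    """Re-assign each vertex to the pixel it currently projects to."""
    uv, _ = project(vertices, R, T, K)
    cols = np.clip(np.rint(uv[:, 0]).astype(int), 0, shape[1] - 1)
    rows = np.clip(np.rint(uv[:, 1]).astype(int), 0, shape[0] - 1)
    return np.stack([rows, cols], axis=1)

def alternating_fit(params, step, loss_fn, update_corr,
                    outer_iters=3, max_inner=100, tol=1e-8):
    """Alternate: freeze the correspondence, optimise until the loss stops
    changing, then refresh the correspondence and repeat."""
    for _ in range(outer_iters):
        corr = update_corr(params)            # frozen during the inner loop
        prev = loss_fn(params, corr)
        for _ in range(max_inner):
            params = step(params, corr)       # one optimiser update (caller-supplied)
            cur = loss_fn(params, corr)
            if abs(prev - cur) < tol:
                break
            prev = cur
    return params
```

Freezing `corr` inside the inner loop keeps the depth target fixed while the parameters move, which is one plausible reading of why the alternation improves stability.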