Baseline
Key Points Detection
- Face key points detection
- Convolutional Pose Machine
- Ear key points detection
- YOLO + Convolutional Pose Machine
3DMM Fitting
Pipeline
- 2D keypoints K1 (including ears) on RGB video from OpenPose and CPM, plus ground-truth (GT) depth d1
- 3DMM (learned) + R, T (learned) + camera intrinsics -> 2D keypoints K2, depth d2
- Loss between K1 and K2, and between d1 and d2
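
A minimal sketch of the projection and loss steps above, assuming a pinhole camera, a squared-error loss, and an equal default weighting of the two terms (none of which the notes specify). The names `project`, `fitting_loss`, `intrinsics` and `w_depth` are hypothetical and used only for illustration.

```python
import numpy as np

def project(vertices, R, T, intrinsics):
    """Project (N, 3) model-space points to 2D pixels (K2) and depth (d2).

    Assumes a pinhole camera; `intrinsics` is the 3x3 camera matrix.
    """
    cam = vertices @ R.T + T              # points in camera space
    depth = cam[:, 2]                     # d2: z-coordinate per point
    uv_h = cam @ intrinsics.T             # homogeneous pixel coordinates
    uv = uv_h[:, :2] / depth[:, None]     # K2: perspective divide
    return uv, depth

def fitting_loss(k1, d1, k2, d2, w_depth=1.0):
    """Keypoint loss (K1 vs K2) plus weighted depth loss (d1 vs d2).

    Squared error and the weight `w_depth` are assumptions.
    """
    keypoint_term = np.mean(np.sum((k1 - k2) ** 2, axis=1))
    depth_term = np.mean((d1 - d2) ** 2)
    return keypoint_term + w_depth * depth_term
```

In an actual fitting loop, the 3DMM parameters, R and T would be updated to reduce this loss, typically with an autodiff framework rather than plain NumPy.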
- Depth loss
- Fix the correspondence between mesh vertices and pixels in the video
- Iterate until the depth loss converges under the fixed correspondence, then update the correspondence -> increases the stability of the model
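
The alternating scheme above can be sketched as two nested loops. This is a heavily simplified illustration, not the actual fitting code: it assumes a translation-only pose, optimizes only the z-translation with plain gradient descent, and uses nearest-pixel lookup for the correspondence. All names (`lookup_pixels`, `fit_depth`, `lr`, `tol`) are hypothetical.

```python
import numpy as np

def lookup_pixels(vertices, t, intrinsics, shape):
    """Project vertices (translation-only pose) to integer pixel indices."""
    cam = vertices + t
    uv = (cam @ intrinsics.T)[:, :2] / cam[:, 2:3]
    h, w = shape
    cols = np.clip(np.rint(uv[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(uv[:, 1]).astype(int), 0, h - 1)
    return rows, cols

def fit_depth(vertices, depth_map, intrinsics, t0,
              outer_iters=5, inner_iters=200, lr=0.25, tol=1e-12):
    """Alternate between fixing the correspondence and minimizing depth loss."""
    t = np.asarray(t0, dtype=float).copy()
    for _ in range(outer_iters):
        # Fix the vertex-to-pixel correspondence for this outer step.
        rows, cols = lookup_pixels(vertices, t, intrinsics, depth_map.shape)
        target = depth_map[rows, cols]            # d1 at the fixed pixels
        prev = np.inf
        for _ in range(inner_iters):
            residual = (vertices[:, 2] + t[2]) - target   # d2 - d1
            loss = np.mean(residual ** 2)
            if abs(prev - loss) < tol:            # depth loss has converged
                break
            t[2] -= lr * 2.0 * np.mean(residual)  # gradient step on t_z
            prev = loss
        # The next outer iteration re-projects and updates the correspondence.
    return t
```

Keeping the correspondence fixed during the inner loop means the depth targets do not change while the parameters move, which is one plausible reading of why this improves stability.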