Mugsy is a high-end multi-view capture system that records synchronized multi-view videos of the facial expressions of subjects inside its dome. The dataset produced by this large-scale camera setup is Multiface. It consists of high-quality recordings of the faces of 13 identities, each captured in a multi-view capture stage while performing various facial expressions. On average, 12,200 (left) to 23,000 (right) frames per subject were captured at 30 fps.

Applications: Photo-realistic human face rendering and reconstruction

However, calibrating the cameras at such a large scale using external calibration objects is time-consuming. Currently, the cameras are calibrated using an icosahedron with deltille grids pasted on its faces.

Problem Statement

  • Build an auto-calibration system capable of determining the intrinsic camera parameters directly from multiple uncalibrated images. The intrinsic parameters are assumed to change from subject to subject, whereas the extrinsic parameters, i.e., the positions and orientations of the cameras in the dome, do not undergo major changes.
  • Integrate this system into the Structure from Motion (SfM) pipeline (presently COLMAP) so that optimization techniques such as bundle adjustment yield better calibration precision than computing the calibration data from a special calibration object.
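
The split described in the bullets above (intrinsics re-estimated, extrinsics held fixed) can be illustrated with a minimal sketch. It is not the planned implementation: it assumes a zero-skew pinhole model without lens distortion, and it assumes the 3D points are already known (for example, triangulated by an SfM pipeline), which a real auto-calibration system would have to estimate jointly. The function names `project` and `estimate_intrinsics` and all numeric values are hypothetical and are not part of COLMAP or Mugsy.

```python
import numpy as np

def project(K, R, t, X):
    """Project Nx3 world points X to Nx2 pixel coordinates (pinhole model)."""
    Xc = X @ R.T + t                   # world -> camera frame (extrinsics)
    x = Xc[:, :2] / Xc[:, 2:3]         # perspective division
    return x @ K[:2, :2].T + K[:2, 2]  # apply intrinsics

def estimate_intrinsics(R, t, X, uv):
    """Least-squares fit of fx, fy, cx, cy (zero skew) with R, t held fixed."""
    Xc = X @ R.T + t
    x = Xc[:, :2] / Xc[:, 2:3]
    ones = np.ones(len(x))
    # u = fx * x + cx and v = fy * y + cy are linear in the unknowns.
    fx, cx = np.linalg.lstsq(np.stack([x[:, 0], ones], axis=1), uv[:, 0], rcond=None)[0]
    fy, cy = np.linalg.lstsq(np.stack([x[:, 1], ones], axis=1), uv[:, 1], rcond=None)[0]
    return np.array([[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])

# Synthetic check with made-up values (assumptions, not Mugsy data).
rng = np.random.default_rng(0)
K_true = np.array([[1200.0, 0.0, 640.0], [0.0, 1180.0, 360.0], [0.0, 0.0, 1.0]])
R = np.eye(3)                          # fixed, known camera orientation
t = np.array([0.0, 0.0, 2.0])          # fixed, known camera translation
X = rng.uniform(-0.5, 0.5, size=(50, 3))
uv = project(K_true, R, t, X)          # noise-free observations

K_est = estimate_intrinsics(R, t, X, uv)
residual = project(K_est, R, t, X) - uv
rms = np.sqrt(np.mean(np.sum(residual ** 2, axis=1)))  # reprojection RMS in pixels
```

The reprojection residual computed at the end is the quantity that bundle adjustment minimizes; in a full pipeline the 3D points, and optionally distortion terms, would be optimized together with the intrinsics rather than taken as given.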

We plan to use only the data already available to us (raw multi-view capture images) to determine the intrinsic parameters of Mugsy's large-scale camera setup.