Motivation

Digital twins, detailed 3D replicas of real-world environments, are becoming essential across industries:

  • 🏗️ Construction teams use them for planning, monitoring progress, and clash detection.
  • 🏙️ Urban planners rely on them to simulate infrastructure changes and improve city design.
  • 🏠 Real estate professionals showcase properties through immersive, interactive experiences.
  • 🏭 Manufacturing and robotics use them for simulation, inspection, and training in virtual environments.

These applications demand accurate, photorealistic reconstructions: not just point clouds or rough meshes, but dynamic, view-dependent representations of the real world. As the demand for spatial understanding grows, so does the need for accessible, high-quality 3D capture pipelines.

Recent advances in novel view synthesis, such as Neural Radiance Fields (NeRFs) and Gaussian Splatting, have significantly enhanced the realism and interactivity of 3D environment rendering. These methods enable photorealistic, view-dependent visualization, making them promising tools for digital twin creation and immersive scene reconstruction.

However, capturing high-quality training data for these methods remains a major challenge:

  • Data Quality Issues: Real-world image and video data often suffer from problems like insufficient scene coverage, poor lighting, reflections, shadows, and occlusions, which negatively affect training outcomes.
  • Lack of User Guidance: Novice users typically lack best-practice knowledge regarding capture angles, camera settings, and lighting conditions, leading to suboptimal datasets.
  • Scale and Coverage: Collecting comprehensive data for large environments usually involves multiple video captures. Ensuring complete coverage without gaps or redundancies is both time-consuming and difficult.
  • Dynamic Elements: The presence of moving objects, such as people or vehicles, introduces additional complexity during both capture and reconstruction.

As a result, users often need to iterate multiple times before achieving satisfactory results.


Project Objectives

This project aims to address these challenges through the following goals:

  1. Identification and modeling of shortcomings in image/video data for training high-fidelity Gaussian Splats
  2. Real-Time User Feedback for Optimal Data Collection
  3. Assistance for Data Collection via Multiple Videos
  4. Identification of Dynamic Objects and Interactive Dataset Optimization