Digital twins, detailed 3D replicas of real-world environments, are becoming essential across industries:
- 🏗️ Construction teams use them for planning, monitoring progress, and clash detection.
- 🏙️ Urban planners rely on them to simulate infrastructure changes and improve city design.
- 🏠 Real estate professionals showcase properties through immersive, interactive experiences.
- 🏭 Manufacturing and robotics use them for simulation, inspection, and training in virtual environments.
These applications demand accurate, photorealistic reconstructions: not just point clouds or rough meshes, but dynamic, view-dependent representations of the real world. As the demand for spatial understanding grows, so does the need for accessible, high-quality 3D capture pipelines.
Recent advances in view-synthesis techniques, such as Neural Radiance Fields (NeRFs) and Gaussian Splatting, have significantly improved the realism and interactivity of 3D environment rendering. These methods enable photorealistic, view-dependent visualization, making them promising tools for digital twin creation and immersive scene reconstruction.
However, capturing high-quality training data for these methods remains a major challenge:
- Data Quality Issues: Real-world image and video data often suffer from problems like insufficient scene coverage, poor lighting, reflections, shadows, and occlusions, which negatively affect training outcomes.
- Lack of User Guidance: Novice users typically lack best-practice knowledge regarding capture angles, camera settings, and lighting conditions, leading to suboptimal datasets.
- Scale and Coverage: Collecting comprehensive data for large environments usually involves multiple video captures. Ensuring complete coverage without gaps or redundancies is both time-consuming and difficult.
- Dynamic Elements: The presence of moving objects, such as people or vehicles, introduces additional complexity during both capture and reconstruction.
As a result, users often need to iterate multiple times before achieving satisfactory results.
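One way to make the coverage problem concrete is to look at how camera positions are distributed around the scene. The following is a minimal sketch, not the project's method. It assumes camera positions are already available as 2D ground-plane coordinates (for example, from a pose-estimation step) and that the scene is roughly centered on a known point. The function name `azimuth_coverage` and the parameters `num_sectors` and `redundancy_limit` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def azimuth_coverage(camera_xy, center_xy=(0.0, 0.0),
                     num_sectors=12, redundancy_limit=3):
    """Bin camera positions by azimuth around an assumed scene center.

    Returns (gaps, redundant):
      gaps      - sector indices that contain no camera at all
      redundant - sector indices with more than `redundancy_limit` cameras
    Sectors are numbered counter-clockwise, starting at the +x axis.
    """
    sector_width = 2 * math.pi / num_sectors
    counts = Counter()
    for x, y in camera_xy:
        # Angle of the camera as seen from the center, mapped to [0, 2*pi).
        angle = math.atan2(y - center_xy[1], x - center_xy[0]) % (2 * math.pi)
        # Clamp guards against a rounding result of exactly 2*pi.
        sector = min(int(angle / sector_width), num_sectors - 1)
        counts[sector] += 1
    gaps = [s for s in range(num_sectors) if counts[s] == 0]
    redundant = [s for s in range(num_sectors) if counts[s] > redundancy_limit]
    return gaps, redundant


# Hypothetical capture: three cameras, all on one side of the scene.
cameras = [(1.0, 0.1), (0.1, 1.0), (-1.0, 0.1)]
gaps, redundant = azimuth_coverage(cameras, num_sectors=4, redundancy_limit=1)
```

In this hypothetical capture, the two sectors on the far side of the scene are reported as gaps, and the sector holding two cameras is flagged as redundant. A real pipeline would likely also need to account for camera height and viewing direction, which this sketch ignores.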
Project Objectives
This project aims to address these challenges through the following goals:
- Identifying and modeling shortcomings in image/video data for training high-fidelity Gaussian Splats
- Real-Time User Feedback for Optimal Data Collection
- Assistance for Data Collection via Multiple Videos
- Identification of Dynamic Objects and Interactive Dataset Optimization
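As an illustration of what per-frame feedback during capture might look like, the sketch below flags frames that are mostly very dark or very bright, one simple proxy for the poor-lighting problem described above. It is an assumption-laden example rather than the project's implementation: the function name `exposure_feedback`, the thresholds `dark_level`, `bright_level`, and `max_fraction`, and the returned messages are all hypothetical, and the frame is assumed to be a grayscale image given as rows of integers in the range 0 to 255.

```python
def exposure_feedback(gray_frame, dark_level=40, bright_level=215,
                      max_fraction=0.4):
    """Return a short capture hint for one grayscale frame.

    gray_frame is a list of rows, each a list of ints in 0..255.
    The thresholds are illustrative guesses, not tuned values.
    """
    pixels = [p for row in gray_frame for p in row]
    if not pixels:
        raise ValueError("empty frame")
    # Fraction of pixels near the dark and bright ends of the range.
    dark = sum(p <= dark_level for p in pixels) / len(pixels)
    bright = sum(p >= bright_level for p in pixels) / len(pixels)
    if dark > max_fraction:
        return "too dark"
    if bright > max_fraction:
        return "too bright"
    return "ok"


# Hypothetical 2x2 frames standing in for real video frames.
dim_frame = [[10, 20], [30, 200]]
washed_out_frame = [[250, 240], [230, 100]]
balanced_frame = [[100, 120], [130, 140]]
hints = [exposure_feedback(f)
         for f in (dim_frame, washed_out_frame, balanced_frame)]
```

A check this cheap could run on every frame, which is what makes it a plausible building block for real-time guidance. Reflections, shadows, and occlusions would need more involved analysis than a brightness histogram can provide.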