Introduction

Thermal infrared (TIR) imaging from unmanned aerial vehicles (UAVs) offers an all-weather, non-contact modality essential for applications such as wildfire monitoring, industrial safety inspection, and nighttime search and rescue. Unlike visible-light imagery, whose reconstruction pipelines falter under low illumination, haze, or fog, TIR data remains robust across diverse environmental conditions. Yet despite advances in drone deployment and IR sensor miniaturization, most thermal footage remains confined to two-dimensional video streams, limiting its utility for spatial analysis, volumetric mapping, and temporal monitoring.

Reconstructing accurate 3D thermal scenes from UAV captures involves several core challenges:

First, low spatial resolution and sensor noise in raw IR frames obscure fine geometric structure and blur object boundaries, as seen in Figure 1.

Figure 1: IR frames are noisy and have low resolution.

Second, feature scarcity, that is, the lack of rich texture and contrast in thermal images, impedes reliable cross-view correspondence and alignment.

Third, coverage gaps and drift arise when stitching multiple drone passes, yielding holes and misregistrations that degrade global consistency, as seen in Figure 2.

Figure 2: Alignment drift across drone passes; multiple passes may be required.

Finally, dynamic thermal phenomena, such as spreading fire fronts or moving heat sources, violate static scene assumptions and introduce temporal artifacts in naïve reconstructions.

To overcome these obstacles, we propose a 3D Gaussian Splatting Thermal Reconstruction Pipeline, a unified, real-time approach to 3D thermal scene reconstruction. Our framework comprises three key modules:

  1. Thermal image enhancement, which employs adaptive edge-preserving filtering and learned priors to sharpen boundaries and restore detail in raw IR frames.
  2. Physics-induced Gaussian splatting, which models atmospheric transmission variability and inter-object thermal conduction by fitting learnable 3D Gaussians whose physical parameters are predicted by a neural network.
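
To make the idea of edge-preserving filtering in the first module concrete, the following is a minimal sketch using a plain bilateral filter in NumPy. The choice of a bilateral filter, the function name `bilateral_filter`, and all parameter values are assumptions made for illustration; the enhancement module described here may use a different filter, and the learned priors are not modeled at all.

```python
import numpy as np


def bilateral_filter(frame, radius=2, sigma_space=1.5, sigma_range=0.1):
    """Edge-preserving smoothing of a 2D float image (hypothetical helper).

    Each output pixel is a weighted mean of its neighbors. The weight drops
    with spatial distance (sigma_space) and with intensity difference
    (sigma_range), so pixels across a strong thermal edge contribute little.
    """
    frame = np.asarray(frame, dtype=np.float64)
    h, w = frame.shape
    padded = np.pad(frame, radius, mode="reflect")
    weighted_sum = np.zeros_like(frame)
    weight_total = np.zeros_like(frame)

    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Neighbor image shifted by (dy, dx), cropped back to frame size.
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            spatial = np.exp(-(dy * dy + dx * dx) / (2.0 * sigma_space ** 2))
            tonal = np.exp(-((shifted - frame) ** 2) / (2.0 * sigma_range ** 2))
            weight = spatial * tonal
            weighted_sum += weight * shifted
            weight_total += weight

    # The center pixel always has weight 1, so weight_total is never zero.
    return weighted_sum / weight_total


# Synthetic stand-in for a noisy IR frame: a vertical step edge plus noise.
rng = np.random.default_rng(0)
clean = np.zeros((32, 32))
clean[:, 16:] = 1.0
noisy = clean + rng.normal(0.0, 0.03, clean.shape)
smoothed = bilateral_filter(noisy)
```

In this sketch, `sigma_range` controls how large an intensity jump is treated as an edge; a value well below the step height keeps the boundary sharp while the flat regions on either side are averaged.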

By investigating and integrating these components, we aim to transform disjoint 2D thermal video into coherent volumetric heat maps, enabling precise, real-time 3D thermal reconstruction even in feature-poor and dynamic environments.