Methodology

We present a physics-guided, two-stage framework for dynamic thermal scene reconstruction from UAV infrared imagery using 3D Gaussian Splatting. The core design principle is to decouple static scene geometry from time-varying thermal dynamics: Stage 1 learns a stable 3D thermal representation at discrete timestamps, and Stage 2 models temperature evolution over time while keeping geometry fixed.

Given calibrated UAV thermal frames captured across a mission, our goal is to reconstruct a thermal scene that supports novel-view synthesis and time-dependent temperature estimation. Thermal imagery introduces challenges not present in RGB—view-dependent attenuation, conduction-induced edge blurring, and temporal thermal variation—which we address with physics-aware components integrated into a Gaussian splatting pipeline.

Thermal Image Formation Revisited

In contrast to visible-light reconstruction, thermal image formation is strongly influenced by scene temperature, which in turn depends on dynamic atmospheric and material conditions. We model the observed thermal frame x_{t,θ} at time t and viewpoint θ as:

x_{t,θ} = P_{t,θ}(f(X)),

where f(X) is the canonical projection of an ideal, unperturbed thermal 3D scene, and P_{t,θ} encodes viewpoint- and time-dependent physical distortions. To generate more accurate reconstructions, we seek to estimate the underlying 3D thermal state X̃ such that:

f(X̃) ≈ P_{t,θ}^{-1}(x_{t,θ}),

thereby correcting for the effects of environmental attenuation and surface-level thermal diffusion.

Stage 1: Static Physics-Guided Thermal 3D Reconstruction

1. Viewpoint-Aware Attenuation Estimator (VAE)

The VAE module models the degradation of thermal intensity due to atmospheric interference. Inspired by classical radiative transfer principles, we represent thermal attenuation with a Beer-Lambert-style law:

I(d) = I_0 · exp(−(k_abs + k_sca) · d),

where k_abs and k_sca denote the absorption and scattering attenuation factors, and d is the propagation path length.

We design a lightweight MLP that predicts the attenuation parameters (k_abs, k_sca, d) for each 3D Gaussian, conditioned on its spatial position and timestamp. This learned correction is applied directly to the Gaussian's appearance encoding (e.g., SH coefficients), yielding:

c̃ = c · exp(−(k_abs + k_sca) · d),

where c denotes the original SH coefficients. This step compensates for the viewpoint-specific drop in thermal radiance and significantly reduces ghosting and floaters.
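The attenuation correction above can be sketched as follows. This is a minimal NumPy illustration; the function name and toy values are ours, and in the full pipeline k_abs, k_sca, and d would be predicted per Gaussian by the MLP rather than hand-set:

```python
import numpy as np

def attenuate_sh(sh_coeffs, k_abs, k_sca, d):
    """Apply Beer-Lambert-style attenuation to per-Gaussian SH appearance
    coefficients: c_tilde = c * exp(-(k_abs + k_sca) * d).

    sh_coeffs: (N, K) array of SH coefficients for N Gaussians.
    k_abs, k_sca, d: (N,) arrays, one value per Gaussian.
    """
    transmittance = np.exp(-(k_abs + k_sca) * d)   # (N,)
    return sh_coeffs * transmittance[:, None]      # broadcast over the K coefficients

# Toy usage: two Gaussians, 4 SH coefficients each.
sh = np.ones((2, 4))
k_abs = np.array([0.0, 0.1])   # first Gaussian: no absorption
k_sca = np.array([0.0, 0.1])
d = np.array([5.0, 5.0])
sh_tilde = attenuate_sh(sh, k_abs, k_sca, d)
```

The unattenuated Gaussian keeps its coefficients unchanged, while the second is scaled by exp(−1) for a total extinction coefficient of 0.2 over a path of length 5.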

2. Thermal Diffusion Correction Unit (TDCU)

To address the edge blurring caused by heat transfer between adjacent surfaces, we introduce the Thermal Diffusion Correction Unit (TDCU). In thermal imaging, heat naturally diffuses from hotter to cooler regions; as this energy propagates through conduction, it smooths temperature gradients across object boundaries. In practice, this manifests in thermal images as softened edges, where the temperature difference between adjacent surfaces becomes visually indistinct, which is particularly problematic in scenes where fine boundaries carry semantic importance. To mitigate this, we draw on the heat equation to model spatial temperature smoothing:

∂u/∂t = α ∇²u,

where u represents the surface temperature field and α captures local diffusivity.

Instead of relying on fixed physical coefficients, the TDCU is implemented as a deep residual network that ingests both the original image and its second-order spatial gradients. By learning a spatially adaptive correction mask, the TDCU selectively sharpens edges degraded by thermal conduction while preserving coherent heat-flow patterns.
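The physics the TDCU learns to invert can be sketched with a fixed-coefficient example: one explicit reverse step of the heat equation, using the discrete Laplacian (the second-order spatial gradients mentioned above) to re-sharpen conduction-blurred edges. This is an illustrative NumPy sketch under a constant assumed α, not the residual network itself:

```python
import numpy as np

def laplacian(u):
    """Discrete 5-point Laplacian with edge replication (zero-flux boundary)."""
    up    = np.pad(u, ((1, 0), (0, 0)), mode="edge")[:-1, :]
    down  = np.pad(u, ((0, 1), (0, 0)), mode="edge")[1:, :]
    left  = np.pad(u, ((0, 0), (1, 0)), mode="edge")[:, :-1]
    right = np.pad(u, ((0, 0), (0, 1)), mode="edge")[:, 1:]
    return up + down + left + right - 4.0 * u

def inverse_diffusion_step(image, alpha=0.1):
    """One explicit reverse step of du/dt = alpha * lap(u): subtracting the
    scaled Laplacian increases contrast across conduction-smoothed edges."""
    return image - alpha * laplacian(image)
```

The learned TDCU replaces the fixed α with a spatially adaptive correction mask, so sharpening is applied only where conduction has actually degraded an edge.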

3. Surface Smoothness Regularizer (SSR)

While real-world thermal distributions tend to change smoothly, abrupt transitions or sharp corners often arise from modeling errors. We introduce the Surface Smoothness Regularizer (SSR) to penalize spurious discontinuities. Drawing on the Harris corner metric, we compute a spatial map of irregular regions and weight their contribution to the loss:

L_SSR = w(i) · Σ_p C(p) · L1(p),

where C denotes the per-pixel corner response, L1 is the per-pixel reconstruction error, and w(i) is a weight annealed over training iterations i. This adaptive term progressively focuses the model on difficult regions during early training, encouraging structural coherence.
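A minimal NumPy sketch of this corner-weighted loss follows, assuming a standard Harris response (structure tensor of image gradients) and an exponential annealing schedule; the 3x3 window, k, and tau values here are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def harris_response(img, k=0.05):
    """Harris corner response R = det(M) - k * trace(M)^2, where M is the
    local structure tensor of the image gradients."""
    gy, gx = np.gradient(img)

    def box3(a):
        # Box-smooth over a 3x3 neighborhood with edge replication.
        p = np.pad(a, 1, mode="edge")
        h, w = a.shape
        return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

    ixx, iyy, ixy = box3(gx * gx), box3(gy * gy), box3(gx * gy)
    det = ixx * iyy - ixy ** 2
    tr = ixx + iyy
    return det - k * tr ** 2

def ssr_loss(pred, target, iteration, tau=1000.0):
    """Corner-weighted L1 error, with a weight that is high early in
    training and decays exponentially with the iteration index."""
    C = np.abs(harris_response(target))
    w = np.exp(-iteration / tau)
    return w * np.mean(C * np.abs(pred - target))
```

Because w(i) decays with the iteration index, corner-heavy regions dominate the objective early on and fade as the reconstruction stabilizes.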

4. Full Optimization Objective

The total training loss combines three components: image fidelity, perceptual consistency, and structural smoothness:

L_total = L_1 + λ_p · L_perc + λ_s · L_SSR,

where λ_s and λ_p are empirically set to 0.2. The model is optimized end-to-end using Adam, with separate learning-rate schedules for the Gaussian parameters and the physics-based modules.
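The weighted combination can be written as a one-line helper (the term names are ours; each input is a scalar loss already computed elsewhere in the pipeline):

```python
def total_loss(l_fidelity, l_perceptual, l_smooth, lam_p=0.2, lam_s=0.2):
    """Weighted sum of the three Stage 1 terms; both weights default to the
    empirically chosen value of 0.2."""
    return l_fidelity + lam_p * l_perceptual + lam_s * l_smooth
```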


Figure 5: Stage 1: physics-guided static thermal 3D reconstruction. The system builds on 3D Gaussian Splatting (3DGS) and integrates three physics-aware modules: the Viewpoint-Aware Attenuation Estimator (VAE) models atmospheric effects to correct view-dependent attenuation; the Thermal Diffusion Correction Unit (TDCU) refines edge clarity by compensating for conduction-induced blurring; and the Surface Smoothness Regularizer (SSR) enforces physically consistent temperature continuity by penalizing spurious discontinuities. Together, these components improve fidelity and realism in novel-view synthesis of thermal infrared scenes.


Stage 2: Dynamic Thermal Modeling via Physics-Guided Temporal Evolution

Stage 2 extends the static reconstruction into a dynamic thermal model by predicting how temperature evolves through time while keeping geometry fixed. This matches the physical assumption that scene structure and appearance are static over short mission windows, while temperature changes.

Inputs

  • Frozen Gaussian geometry and SH appearance from Stage 1
  • Position encodings and time embeddings spanning intermediate timestamps
  • Optional additional scene cues, if available (e.g., semantic/geometric features), depending on the dataset; the core pipeline remains SH-based

Thermal Property Estimation

We predict physically meaningful per-Gaussian thermal properties that govern heat transfer, including emissivity, heat transfer coefficient, and heat capacity. These quantities parameterize a thermodynamic update that explains observed cooling/heating patterns rather than fitting each timestep independently.

Temporal Integration

Given an initial temperature and predicted thermal properties, we compute per-step temperature variation and integrate across the discretized time interval to obtain Temp_integral(t). We supervise both instantaneous temperature predictions and integrated evolution to enforce consistency and physical plausibility.
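The temporal integration step can be illustrated with a simple explicit-Euler energy balance. The specific convective-plus-radiative form and the parameter values below are our assumptions for illustration, not the learned update; in the pipeline, h, c, and the emissivity are the per-Gaussian properties predicted above:

```python
import numpy as np

SIGMA = 5.670e-8  # Stefan-Boltzmann constant (W m^-2 K^-4)

def integrate_temperature(T0, T_amb, h, c, emissivity, t_end, dt=1.0):
    """Explicit Euler integration of a per-Gaussian energy balance:
        c * dT/dt = -h * (T - T_amb) - emissivity * SIGMA * (T^4 - T_amb^4)
    Returns the temperature trajectory sampled every dt seconds."""
    T = np.asarray(T0, dtype=float).copy()
    trajectory = [T.copy()]
    for _ in range(int(round(t_end / dt))):
        convective = -h * (T - T_amb)                           # Newton cooling
        radiative = -emissivity * SIGMA * (T ** 4 - T_amb ** 4) # net radiation
        T = T + dt * (convective + radiative) / c
        trajectory.append(T.copy())
    return np.stack(trajectory)

# Toy usage: an object at 350 K cooling toward a 300 K ambient.
traj = integrate_temperature(350.0, 300.0, h=10.0, c=1000.0,
                             emissivity=0.9, t_end=100.0)
```

Supervising both the instantaneous derivative and the accumulated trajectory, as described above, ties each intermediate timestamp to the same thermodynamic parameters.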

Stage 2 Output

  • A time-continuous temperature estimate that supports rendering thermal novel views at arbitrary timestamps.
  • Consistent thermal evolution over time, grounded by learned thermodynamic parameters.

Training and Rendering

Both stages rely on differentiable Gaussian rasterization for rendering. We optimize against observed thermal frames using reconstruction losses and physics-aware regularization. Stage 1 focuses on stable geometry and per-timestamp temperatures, while Stage 2 optimizes temperature evolution under physical constraints with geometry frozen.

Pipeline Overview

  • Stage 1 recovers a stable static thermal 3D scene and Temp(t) at observed timestamps.
  • Stage 2 models time-varying temperature using learned thermal properties and temporal integration.
  • Geometry and appearance remain fixed; only temperature evolves, enabling dynamic thermal reconstruction.

Figure 6. Overview of our proposed two-stage thermal reconstruction framework.
Stage 1 performs physics-guided static thermal 3D reconstruction, producing optimized 3D Gaussians and per-Gaussian temperature at discrete timestamps. Stage 2 freezes geometric parameters and models temporal thermal evolution using time- and position-encoded inputs, predicting continuous temperature changes via thermodynamic integration for dynamic thermal novel-view synthesis.