Image Restoration

Our image restoration pipeline combines two main ideas: restoring a degraded image and, in doing so, improving downstream perception performance. For the former, we use a transformer-based model, TransWeather; for the latter, we draw on task-driven learning methods.

TransWeather [1] This model is built on a transformer architecture, taking any weather-degraded image as input and outputting a weather-free image of the same scene. In addition to standard transformer blocks, intra-patch transformer blocks propagate patch-level features and information through the network. In principle, a single model can handle all weather types, expanding on previous works in which a dedicated encoder block per weather type was necessary; this generality is achieved by increasing the number of parameters learned in the model. Furthermore, learnable weather-type queries in the transformer decoders attend to the weather type present in the image.

TransWeather model architecture

Task-Driven Methods [2] Restoration on its own is not sufficient if it yields no significant improvement in downstream perception tasks. Thus, during training, we incorporate the framework of [2] into our pipeline. This task-driven enhancement framework uses two models: the restoration network and a network for a high-level vision task. During training, a recovery loss, a feature identity loss, and a high-level task loss are combined to optimize restoration quality and task performance in parallel. Our resulting model combines TransWeather with a 2D object detection task under this framework.

Framework for task-driven image enhancement [2]
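The three-loss objective described above can be sketched as a single scalar. The weighting factors lam_feat / lam_task and the plain-NumPy formulation are illustrative assumptions for exposition, not values or code from [2]:

```python
import numpy as np

def combined_loss(restored, clean, feat_restored, feat_clean, task_loss,
                  lam_feat=0.1, lam_task=1.0):
    """Combine recovery, feature identity, and high-level task losses."""
    # Recovery loss: pixel-wise fidelity between restored and clean images.
    recovery = np.mean(np.abs(restored - clean))
    # Feature identity loss: task-network features of the restored image
    # should match the features extracted from the clean image.
    feat_identity = np.mean((feat_restored - feat_clean) ** 2)
    # task_loss is the high-level loss (here, 2D detection), computed by
    # running the task network on the restored image.
    return recovery + lam_feat * feat_identity + lam_task * task_loss
```

In the actual framework the three terms are backpropagated jointly through the restoration network, so the restorer is optimized for the detector rather than for pixel fidelity alone.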

Synthetic Image Generation

The image restoration network requires a similarity score between a ground-truth (clear) image and the restored image of the same scene to assess the quality of the restoration. Thus, we augment the KITTI dataset [3] by generating synthetic rain and fog images. Synthetic fog generation follows the atmospheric scattering model [4]:

I(x) = J(x) t(x) + A (1 − t(x)), with t(x) = e^(−β d(x))

In the above equation, the foggy image I(x) is generated from the clear image J(x) using the transmission map t(x), atmospheric light A, scattering coefficient β, and depth map d(x). We apply varying values of β to the data.
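As a concrete sketch, the scattering model can be applied per pixel, assuming an HxWx3 image and an HxW depth map as NumPy arrays in [0, 1]; the default atmospheric-light value here is an illustrative assumption:

```python
import numpy as np

def add_fog(clear, depth, beta, atmospheric_light=0.9):
    """clear: HxWx3 image in [0, 1]; depth: HxW depth map in meters."""
    t = np.exp(-beta * depth)          # transmission map t(x) = e^(-beta * d(x))
    t = t[..., None]                   # broadcast over the color channels
    return clear * t + atmospheric_light * (1.0 - t)   # I = J t + A (1 - t)
```

Larger β (or greater depth) drives t(x) toward 0, so distant pixels fade toward the atmospheric light, which is why varying β yields varying fog densities.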

Similarly, we render synthetic rain streaks on the images at varying rainfall rates (mm/hr) using a combination of a GAN and a physics-based rendering method [5].

Example synthetic fog and rain images

Weather Classification

Our weather classification task has two prediction heads: (1) weather type and (2) weather degree (severity). The model uses a ResNet50 backbone together with engineered features corresponding to saturation, dark channel prior, and local contrast. These features are represented as binned histograms and concatenated at the linear layers to form the predictions. The weather-type head is trained on real-world images with a cross-entropy loss, while the degree head is trained on both synthetic and real images, using the corresponding β and rain rate in mm/hr (normalized to [0, 1]) as the ground-truth degree. The degree head has yet to be implemented.
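A rough sketch of how these engineered features might be computed and binned; the patch sizes, bin count, value ranges, and the naive NumPy min filter are assumptions for illustration, not the production implementation:

```python
import numpy as np

def dark_channel(img, patch=15):
    """Dark channel prior: per-pixel channel minimum, then a local min filter."""
    mins = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(mins, pad, mode='edge')
    out = np.empty_like(mins)
    for i in range(mins.shape[0]):
        for j in range(mins.shape[1]):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

def binned_features(img, bins=16):
    """Concatenated histograms of dark channel, saturation, and local contrast."""
    dc = dark_channel(img)
    mx, mn = img.max(axis=2), img.min(axis=2)
    saturation = (mx - mn) / (mx + 1e-6)
    # Local contrast: deviation of intensity from a 3x3 box-blurred mean.
    gray = img.mean(axis=2)
    padded = np.pad(gray, 1, mode='edge')
    local_mean = sum(padded[di:di + gray.shape[0], dj:dj + gray.shape[1]]
                     for di in range(3) for dj in range(3)) / 9.0
    contrast = np.abs(gray - local_mean)
    hists = [np.histogram(x, bins=bins, range=(0.0, 1.0), density=True)[0]
             for x in (dc, saturation, contrast)]
    return np.concatenate(hists)  # histogram vector for the linear layers
```

In the classifier, a vector like this would be concatenated with the ResNet50 embedding before the final linear layers that produce the two predictions.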


[1] Valanarasu, J. M. J., Yasarla, R., & Patel, V. M. “TransWeather: Transformer-based restoration of images degraded by adverse weather conditions.” CVPR 2022

[2] Lee, Younkwan, et al. “Task-driven deep image enhancement network for autonomous driving in bad weather.” IEEE ICRA 2021


[3] Geiger, Andreas, Philip Lenz, and Raquel Urtasun. “Are we ready for autonomous driving? The KITTI vision benchmark suite.” CVPR 2012

[4] Narasimhan, Srinivasa G., and Shree K. Nayar. “Vision and the atmosphere.” International Journal of Computer Vision 2002

[5] Tremblay, Maxime, et al. “Rain rendering for evaluating and improving robustness to bad weather.” International Journal of Computer Vision 2021