On this page, we briefly introduce our work from Fall 2023.
Title:
Instance-level image warping for domain adaptation
Links:
Contributions:
- Introduce instance-level image warping to address object scale bias in driving scenes.
- Incorporate image warping and feature unwarping into domain adaptation.
- Task-agnostic and plug-and-play during training; no warping during testing.
Motivation
Non-uniform Instance-level Image Warp. Bounding boxes mark small (red), medium (green) and large (blue) objects in the image. Object scale bias challenges contemporary visual recognition systems, whereas our instance-level image warping oversamples the medium and large objects to effectively address that bias.
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-19.png)
Object Size Distribution Shift and AP Improvement
Left: By shifting the distribution rightward, we reduce the number of small objects while increasing the number of medium and large objects. Right: Fewer small objects have minimal impact, whereas more medium and large objects bring significant improvements.
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-20-1024x380.png)
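The distribution shift in the left panel can be illustrated with a small counting sketch. It assumes the COCO size convention (small: area below 32² pixels, large: area of 96² pixels or more), which this page does not state explicitly, and it models magnification with a single hypothetical `scale` factor, whereas the actual warp is non-uniform. The function names are illustrative only.

```python
from collections import Counter

# Assumed COCO-style area thresholds in pixels^2 (not confirmed by this page).
SMALL_MAX = 32 ** 2
MEDIUM_MAX = 96 ** 2


def size_bucket(box, scale=1.0):
    """Classify an (x1, y1, x2, y2) box after a hypothetical uniform magnification."""
    x1, y1, x2, y2 = box
    area = (x2 - x1) * (y2 - y1) * scale ** 2
    if area < SMALL_MAX:
        return "small"
    if area < MEDIUM_MAX:
        return "medium"
    return "large"


def size_histogram(boxes, scale=1.0):
    """Count boxes per size bucket."""
    return Counter(size_bucket(box, scale) for box in boxes)
```

For example, with boxes of side 20, 60 and 100 pixels, doubling the scale moves the 20-pixel box from small to medium and the 60-pixel box from medium to large, which is the rightward shift the figure describes.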
State-of-the-Art Performance
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-30-1024x456.png)
Workflow
Instance-level Image Warping for Domain Adaptation. For the source domain stage, we generate a saliency map from the source bounding boxes, warp the image accordingly, and unwarp the features before computing the supervised loss. For the target domain stage, we apply neither warping nor unwarping.
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-18-1024x393.png)
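The source-domain steps can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not our actual formulation: it swaps in a simple separable inverse-CDF warp with nearest-neighbour sampling, the `base` and `gain` saliency values are arbitrary, and the unwarp is applied to the image itself as a stand-in for network features. All function names are hypothetical.

```python
import numpy as np


def box_saliency(boxes, height, width, base=1.0, gain=4.0):
    """Uniform base saliency plus `gain` inside each (x1, y1, x2, y2) box."""
    sal = np.full((height, width), base, dtype=np.float64)
    for x1, y1, x2, y2 in boxes:
        sal[int(y1):int(y2), int(x1):int(x2)] += gain
    return sal


def axis_maps(saliency_1d):
    """Build coordinate maps for one axis from its marginal saliency.

    forward[i]: position in the warped axis of original pixel i.
    inverse[j]: original coordinate sampled by warped pixel j.
    """
    n = saliency_1d.size
    cdf = np.cumsum(saliency_1d)
    cdf = (cdf - cdf[0]) / (cdf[-1] - cdf[0])  # monotone, spans [0, 1]
    grid = np.arange(n, dtype=np.float64)
    forward = cdf * (n - 1)
    inverse = np.interp(grid, forward, grid)   # invert the monotone map
    return forward, inverse


def resample(img, ys, xs):
    """Nearest-neighbour lookup of img on the outer grid of ys and xs."""
    yi = np.clip(np.rint(ys), 0, img.shape[0] - 1).astype(int)
    xi = np.clip(np.rint(xs), 0, img.shape[1] - 1).astype(int)
    return img[np.ix_(yi, xi)]


def warp_and_unwarp(image, boxes):
    """Warp an image so salient boxes are magnified, then map it back."""
    h, w = image.shape[:2]
    sal = box_saliency(boxes, h, w)
    fy, iy = axis_maps(sal.sum(axis=1))  # row-wise marginal
    fx, ix = axis_maps(sal.sum(axis=0))  # column-wise marginal
    warped = resample(image, iy, ix)     # salient regions get more pixels
    restored = resample(warped, fy, fx)  # stand-in for feature unwarping
    return warped, restored
```

In training, the forward maps would be applied to the features or predictions computed on `warped`, so that the supervised loss is evaluated in the original image coordinates. Target-domain images skip both steps.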
Mathematical Details
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-29.png)
Supervised Semantic Segmentation (Cityscapes)
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-22-1024x304.png)
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-26-1024x168.png)
Domain Adaptive Semantic Segmentation (Cityscapes -> ACDC)
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-23-1024x335.png)
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-27-1024x283.png)
Domain Adaptive Object Detection (BDD100k day -> night)
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-25-1024x336.png)
![](https://mscvprojects.ri.cmu.edu/f23team20/wp-content/uploads/sites/97/2023/12/image-28.png)
References
[1] Kennerley, Mikhail, et al. “2PCNet: Two-Phase Consistency Training for Day-to-Night Unsupervised Domain Adaptive Object Detection.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
[2] Hoyer, Lukas, Dengxin Dai, and Luc Van Gool. “DAFormer: Improving network architectures and training strategies for domain-adaptive semantic segmentation.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
[3] Xie, Enze, et al. “SegFormer: Simple and efficient design for semantic segmentation with transformers.” Advances in Neural Information Processing Systems 34 (2021): 12077-12090.
[4] Cordts, Marius, et al. “The Cityscapes dataset for semantic urban scene understanding.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[5] Sakaridis, Christos, Dengxin Dai, and Luc Van Gool. “ACDC: The adverse conditions dataset with correspondences for semantic driving scene understanding.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.
[6] Sakaridis, Christos, Dengxin Dai, and Luc Van Gool. “Guided curriculum model adaptation and uncertainty-aware evaluation for semantic nighttime image segmentation.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.
[7] Thavamani, Chittesh, et al. “FOVEA: Foveated image magnification for autonomous navigation.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.
[8] Ghosh, Anurag, et al. “Learned Two-Plane Perspective Prior based Image Resampling for Efficient Object Detection.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
[9] Thavamani, Chittesh, et al. “Learning to Zoom and Unzoom.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
[10] Yu, Fisher, et al. “BDD100K: A diverse driving dataset for heterogeneous multitask learning.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.