Modern deep learning pipelines require labor-intensive annotation of collected data. When we cannot collect enough data, or enough labels, a common workaround is to pre-train the network on synthetic data rendered by a graphics engine. However, deep models generalize poorly to new domains. In this work we address the problem of unsupervised domain adaptation: specifically, we use labeled images rendered by the GTA5 game together with unlabeled real-world images to train a segmentation network that performs well on real-world data. This is very useful in self-driving and robotics, since millions of synthetic images can be used for training while still achieving good performance on real data. Our methodology is based on CyCADA. More details can be found in our posted presentations.
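One core ingredient of CyCADA is the CycleGAN-style cycle-consistency loss, which keeps the synthetic-to-real style translation invertible. The sketch below illustrates that loss in NumPy; the generators `g_st` and `g_ts` are hypothetical toy stand-ins for the learned mappings, not the actual models.

```python
import numpy as np

def cycle_consistency_loss(x, g_st, g_ts):
    """L1 reconstruction loss ||G_ts(G_st(x)) - x||_1 (CycleGAN-style)."""
    return np.abs(g_ts(g_st(x)) - x).mean()

# Toy generators: hypothetical stand-ins for the learned mappings.
g_st = lambda x: x + 0.5      # synthetic -> real style
g_ts = lambda x: x - 0.5      # real -> synthetic style

x = np.zeros((3, 4, 4))       # a dummy "GTA5 image" (C, H, W)
loss = cycle_consistency_loss(x, g_st, g_ts)
print(loss)  # 0.0, since g_ts exactly inverts g_st here
```

In the full method this term is combined with adversarial and semantic-consistency losses; here only the cycle term is shown for clarity.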
In addition to domain adaptation techniques, we also explore meta-learning to further improve CyCADA. We use Cycle-GTA5 (GTA5 images after CycleGAN style transfer) for meta-training and Cityscapes for meta-testing. Our key observation is that a domain gap remains between the synthetic and real datasets. Although the unsupervised domain adaptation setting assumes no annotations for the target domain, in practice it is easy to provide annotations for just a few real images. After training with few-shot segmentation techniques, the model can be expected to learn the appearance of real-world pixels from those few real-domain examples. We train our network in two steps: first with CyCADA, and then in a few-shot manner with a Guided Network. Our preliminary results are shown in the presentation slides.
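The episodic training setup described above can be sketched as follows. This is a minimal illustration of how support/query episodes would be sampled, assuming hypothetical image-ID lists standing in for the Cycle-GTA5 and Cityscapes datasets; the Guided Network step itself is elided.

```python
import random

def make_episode(dataset, n_support=5, n_query=5):
    """Sample one few-shot episode: a labeled support set and a query set."""
    items = random.sample(dataset, n_support + n_query)
    return items[:n_support], items[n_support:]

# Hypothetical frame IDs standing in for the two domains.
cycle_gta5 = [f"gta_{i}" for i in range(100)]   # meta-train (synthetic)
cityscapes = [f"city_{i}" for i in range(20)]   # meta-test (real)

# Meta-training: many episodes drawn from the labeled synthetic domain.
for _ in range(3):
    support, query = make_episode(cycle_gta5)
    # A Guided Network step would condition on `support`
    # and predict segmentations for `query` here.

# Meta-testing: a single few-shot episode on real images.
support, query = make_episode(cityscapes)
print(len(support), len(query))  # 5 5
```

The episode structure mirrors deployment: at test time the model sees only a handful of annotated real images (the support set) before segmenting new ones.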