Synthetic Data Collection
Sufficient training data is essential for our models to generalize to different kinds of aircraft and to different hangar environments.
To acquire this data, we exploit ShapeNet's rich repository of aircraft models. We first manually annotate S models, each with 9 to 11 3D keypoints in the world coordinate system. We then render each model from N different views. For each view, we compare the depth of each 3D keypoint in the camera coordinate system with the depth value at its 2D projection on the corresponding depth map; if the two agree, the keypoint is visible in that view, otherwise it is occluded.
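The visibility check is a straightforward depth comparison. Below is a minimal sketch assuming a pinhole camera model; the function name, argument layout, and tolerance are illustrative rather than taken from the actual pipeline.

```python
import numpy as np

def keypoint_visibility(kps_world, R, t, K, depth_map, tol=1e-2):
    """Return a boolean mask marking which 3D keypoints are visible in a view.

    kps_world: (n, 3) keypoints in world coordinates
    R, t:      camera extrinsics (3x3 rotation, translation 3-vector)
    K:         3x3 camera intrinsics
    depth_map: (H, W) depth rendering of the model from this view
    """
    kps_cam = kps_world @ R.T + t        # world -> camera coordinates
    z = kps_cam[:, 2]                    # keypoint depth in the camera frame
    uvw = kps_cam @ K.T                  # apply intrinsics
    uv = uvw[:, :2] / uvw[:, 2:3]        # perspective divide -> pixel coordinates
    h, w = depth_map.shape
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    in_image = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (z > 0)
    surface_z = depth_map[v.clip(0, h - 1), u.clip(0, w - 1)]  # nearest rendered surface
    # Visible iff the keypoint projects into the image and is not occluded,
    # i.e. its own depth (approximately) matches the surface depth there.
    return in_image & (np.abs(z - surface_z) < tol)
```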
The rendered images do not contain a background, so we blend each rendering with M random aircraft-hangar background images downloaded from the internet. This yields a dataset of S × N × M images.
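Since each rendering carries an alpha channel, the blending reduces to an alpha composite. A minimal sketch with PIL follows; the file handling and the way backgrounds are supplied are assumptions.

```python
import random
from PIL import Image

def composite(render_path, background_paths):
    """Blend an RGBA aircraft rendering over a randomly chosen hangar background."""
    render = Image.open(render_path).convert("RGBA")
    bg = Image.open(random.choice(background_paths)).convert("RGB")
    bg = bg.resize(render.size)
    bg.paste(render, (0, 0), mask=render)  # the alpha channel drives the blend
    return bg
```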
Model Training
Overview
We train a model to predict keypoints given an image containing an aircraft.
Dataset
For initial training, we use images from the aeroplane class of the PASCAL3D+ dataset: 1906 images in the training set and 477 images in the validation set. We augment the images at training time to artificially enlarge the dataset (a sketch of one possible pipeline appears after the figure). The figure below shows some example images from the dataset.

Fig: Sample images from the dataset
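The exact transforms we apply are not listed here; as one possible sketch, a keypoint-aware pipeline built with albumentations (the transforms and probabilities below are illustrative) keeps the 2D annotations consistent with the augmented image:

```python
import albumentations as A

# Hypothetical augmentation pipeline; albumentations transforms the keypoint
# coordinates together with the image.
transform = A.Compose(
    [
        A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.1, rotate_limit=15, p=0.5),
        A.ColorJitter(p=0.3),
        A.HorizontalFlip(p=0.5),
    ],
    keypoint_params=A.KeypointParams(format="xy", remove_invisible=False),
)

out = transform(image=image, keypoints=keypoints)  # image: HxWx3 array, keypoints: [(x, y), ...]
aug_image, aug_keypoints = out["image"], out["keypoints"]
```

Note that a horizontal flip swaps semantically left/right keypoints (e.g. the two wingtips), so the keypoint order has to be remapped after flipping; the library does not do this automatically.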
Eight keypoints are annotated for each image at the locations shown in the figure below.

Fig: Locations of keypoints
Training Configuration
We train a network with a simple encoder-decoder architecture to predict one heatmap per keypoint. The training configuration is summarized in the table below; a minimal code sketch of this setup follows it.
| Setting | Configuration |
| --- | --- |
| Encoder | ResNet-34 |
| Decoder | UNet-like decoder with 8 output channels |
| Loss function | L2 loss |
| Learning rate | 1e-3 with step decay after each epoch |
| Batch size | 16 |
| Epochs | 100 |
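The table translates fairly directly into PyTorch. In the sketch below, the decoder widths, the pretrained weights, the choice of Adam, the step-decay factor, and `train_loader` (assumed to yield image batches with Gaussian heatmap targets, one channel per keypoint) are all illustrative assumptions, and the skip connections of a full UNet are omitted for brevity.

```python
import torch
import torch.nn as nn
import torchvision

class KeypointHeatmapNet(nn.Module):
    """ResNet-34 encoder with a lightweight upsampling decoder that outputs
    one heatmap per keypoint (8 channels)."""

    def __init__(self, n_keypoints=8):
        super().__init__()
        resnet = torchvision.models.resnet34(weights="IMAGENET1K_V1")
        self.encoder = nn.Sequential(*list(resnet.children())[:-2])  # drop avgpool/fc
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(512, 256, 2, stride=2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(256, 128, 2, stride=2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 2, stride=2), nn.ReLU(inplace=True),
            nn.Conv2d(64, n_keypoints, kernel_size=1),  # 8 output channels
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = KeypointHeatmapNet()
criterion = nn.MSELoss()                                   # L2 loss on heatmaps
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Step decay after each epoch; the decay factor is an assumption.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)

for epoch in range(100):
    for images, target_heatmaps in train_loader:  # batch size 16; Gaussian targets
        optimizer.zero_grad()
        loss = criterion(model(images), target_heatmaps)
        loss.backward()
        optimizer.step()
    scheduler.step()
```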