Synthetic Data Collection


Training our models requires sufficient data for them to generalize to different kinds of aircraft and to different hangar environments.

Since it’s difficult to obtain enough training data containing real aircraft, we create a synthetic dataset from a 3D CAD model of a C-17 aircraft downloaded from the internet. The model is rendered from thousands of randomly generated viewpoints to produce a dataset of aircraft images with corresponding poses.
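The random viewpoints can be generated by sampling camera positions uniformly on a sphere around the model. A minimal sketch in numpy (the 30 m radius is an assumed rendering distance, not a value from the pipeline):

```python
import numpy as np

def random_viewpoint(rng, radius=30.0):
    """Sample a camera position uniformly on a sphere around the aircraft.

    Normalising a 3D Gaussian sample gives a uniform direction; scaling by
    `radius` (an assumed camera distance) places the camera on the sphere.
    """
    v = rng.normal(size=3)
    v /= np.linalg.norm(v)
    return radius * v

rng = np.random.default_rng(0)
position = random_viewpoint(rng)  # one random camera position
```

In a real pipeline, each sampled position would be passed to the renderer along with a look-at orientation toward the model's origin, and the resulting camera pose stored as the label for that image.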

Figure: Data generation pipeline

However, this only creates a dataset of aircraft rendered on plain backgrounds. To make the data more realistic and to teach the network to distinguish foreground from background, we composite the rendered aircraft images onto plausible background images downloaded from the internet. Compositing with random backgrounds also artificially expands the dataset. We generate 5000 images of aircraft and download 300 background images; by randomly pairing foregrounds and backgrounds, we expand the dataset to 5000 × 300 = 1.5 million images. The data generation pipeline is shown in the figure above.
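The compositing step described above amounts to standard alpha blending: wherever the renderer's mask marks aircraft pixels, the foreground is kept; elsewhere the background shows through. A minimal numpy sketch (array shapes and the use of a renderer-provided alpha mask are assumptions about the pipeline):

```python
import numpy as np

def composite(foreground, alpha, background):
    """Alpha-composite a rendered aircraft image over a background image.

    foreground, background: float arrays in [0, 1], shape (H, W, 3).
    alpha: float array in [0, 1], shape (H, W, 1), e.g. the renderer's
           object mask (1 on aircraft pixels, 0 elsewhere).
    """
    return alpha * foreground + (1.0 - alpha) * background

# Toy example: the "aircraft" occupies the centre of a 4x4 image.
rng = np.random.default_rng(0)
fg = rng.random((4, 4, 3))
bg = rng.random((4, 4, 3))
alpha = np.zeros((4, 4, 1))
alpha[1:3, 1:3] = 1.0
out = composite(fg, alpha, bg)
```

Because any of the 5000 foregrounds can be paired with any of the 300 backgrounds, pairs can be sampled on the fly during training rather than materialising all 1.5 million images on disk.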


Figure: Examples of foreground images from the dataset rendered with random backgrounds

The figure above shows examples of images used as inputs to the model during training.

Model Training


Figure: Simplified model architecture

We train a model to directly regress the pose from an image containing an aircraft. We parametrize the rotation using quaternions, which improves training stability. This simple framework is shown in the figure above.
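With a quaternion parametrization, the regressor's raw output must be decoded into a valid pose: the quaternion part is normalised to unit length so it represents a proper rotation. A minimal sketch, assuming the 7-vector is laid out as quaternion first, then translation (the layout is an assumption, not stated in the source):

```python
import numpy as np

def decode_pose(output):
    """Decode the regressor's 7-vector into (rotation, translation).

    Assumed layout: output[:4] is an unnormalised quaternion (w, x, y, z),
    output[4:] is the translation. Normalising the quaternion projects the
    raw network output onto the space of valid rotations.
    """
    q, t = output[:4], output[4:]
    q = q / np.linalg.norm(q)
    return q, t
```

Normalising at decode time is why direct regression of 4 quaternion components is trainable: the network can output any 4-vector and still yield a valid rotation after projection.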

Training Configuration

The training configuration is summarized in the table below:

Encoder: ResNet-34 without final classifier
Regressor: Fully connected layer with output size 7
Loss Function: L1 loss
Learning Rate: 1e-3 with step decay after each epoch
Batch Size: 16

Table: Training Configuration
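Two pieces of the configuration above are easy to make concrete: the L1 loss on the 7-dimensional pose vector, and the per-epoch step decay of the learning rate. A minimal numpy sketch (the decay factor `gamma=0.5` is an assumed value; the source only says "step decay after each epoch"):

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error over the predicted pose vector."""
    return np.abs(pred - target).mean()

def stepped_lr(base_lr, epoch, gamma=0.5):
    """Learning rate after `epoch` decay steps, starting from base_lr.

    gamma is an assumed per-epoch decay factor for illustration.
    """
    return base_lr * gamma ** epoch
```

In a full training loop, `l1_loss` would be computed over batches of 16 image/pose pairs, and `stepped_lr(1e-3, epoch)` would set the optimizer's learning rate at the start of each epoch.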