Experiments - Finding Dents and Dings With a Drone using Structured Light

Dataset

We conducted experiments on an aircraft fuselage to evaluate the performance of our pipeline (see image below). Artificial dents were introduced manually using a hammer, with depths ranging from 1 to 5 mm and an average depth of approximately 2 mm. In total, 391 dents were created across a 3 × 0.75 m area. Using our pipeline, scanning the entire fuselage required roughly 10 minutes.

The aircraft fuselage used in our experiments

Rig Setup

To set up the two-camera stereo system, we built a rig that holds both cameras and the laser (see images below). This ensures that the cameras and laser share a common baseline and maintain fixed relative positions. Note that this configuration is intended only as a convenient test platform for evaluating our pipeline. For practical deployment, the system could be mounted on a drone carrying the cameras and the laser.

Rig setup that holds cameras and the laser

First Stage Results

The results of the first-stage pipeline are shown below. For images captured at a distance of 1.5 m, we achieve a recall of 93.6%. When the camera is moved farther from the aircraft, to approximately 2.5 m, recall drops sharply to 76.2%. This decline occurs because, at greater distances, many laser-line deformations shrink below 1 px in the image and become difficult for computer vision algorithms to detect. Precision at this stage is also low (73.6% at 1.5 m and 69.0% at 2.5 m): artifacts such as screws, holes, and rivets also distort the laser line, producing numerous false positives. These are addressed in the second-stage pipeline.
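The distance effect can be illustrated with the standard triangulation relation, in which the image shift caused by a depth change dz at distance z is roughly f * b * dz / z^2 for focal length f (in pixels) and baseline b. The sketch below is only an illustration: the focal length and baseline are hypothetical values, not the parameters of our rig, and the function name is ours.

```python
def laser_shift_px(depth_mm, distance_m, focal_px, baseline_m):
    """Approximate laser-line shift in pixels for a dent of the given depth.

    Small-depth triangulation approximation: shift ~ f * b * dz / z**2.
    """
    depth_m = depth_mm / 1000.0
    return focal_px * baseline_m * depth_m / distance_m ** 2


# Hypothetical rig parameters, chosen only for illustration.
FOCAL_PX = 5000.0
BASELINE_M = 0.3

for distance_m in (1.5, 2.5):
    shifts = [laser_shift_px(d, distance_m, FOCAL_PX, BASELINE_M) for d in (1, 2, 5)]
    print(f"{distance_m} m:", [round(s, 2) for s in shifts], "px for 1, 2, 5 mm dents")

# The ratio between the two distances does not depend on f or b.
ratio = (laser_shift_px(2, 2.5, FOCAL_PX, BASELINE_M)
         / laser_shift_px(2, 1.5, FOCAL_PX, BASELINE_M))
print(f"shift at 2.5 m relative to 1.5 m: {ratio:.2f}")
```

Whatever the actual rig parameters are, under this approximation the shift at 2.5 m is (1.5 / 2.5)^2 = 0.36 of the shift at 1.5 m, so a deformation of about 2.8 px or less at 1.5 m falls below 1 px at 2.5 m.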

Images taken with the cameras. The under-exposed image (right) is the input to the first-stage pipeline.

Distance	TP	FN	FP	Precision	Recall
1.5 m	366	25	131	73.6%	93.6%
2.5 m	298	93	134	69.0%	76.2%

Result of the first-stage pipeline
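For reference, the precision and recall columns follow directly from the counts in the table: precision = TP / (TP + FP) and recall = TP / (TP + FN). A minimal sketch that recomputes them (the function and variable names are ours):

```python
def precision_recall(tp, fn, fp):
    """Return (precision, recall) from true-positive, false-negative, false-positive counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


# Counts (TP, FN, FP) taken from the first-stage table.
first_stage = {"1.5 m": (366, 25, 131), "2.5 m": (298, 93, 134)}

results = {}
for distance, (tp, fn, fp) in first_stage.items():
    results[distance] = precision_recall(tp, fn, fp)
    p, r = results[distance]
    print(f"{distance}: precision={p:.1%}, recall={r:.1%}")
# 1.5 m: precision=73.6%, recall=93.6%
# 2.5 m: precision=69.0%, recall=76.2%
```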

Second Stage Results

For the second stage, we use the patches extracted in the first stage as our dataset. We evaluate the performance of the ML classifier using 5-fold cross-validation. The resulting confusion matrix is shown below, with the classifier achieving an accuracy of 89.6%. Note that this accuracy is somewhat limited by the relatively small size of our dataset (approximately 1,000 patches). We expect that expanding the dataset would substantially improve the classifier’s performance.

Input patches to the second stage pipeline

	Predicted Negative	Predicted Positive
Actual Negative	0.89	0.10
Actual Positive	0.11	0.90

Confusion matrix of the second-stage ML classifier
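The evaluation protocol can be sketched as follows: split the patches into five folds, train on four, predict on the held-out fold, and pool the predictions into a row-normalized confusion matrix. This is a minimal sketch, not our actual classifier: the threshold model and the synthetic one-dimensional features are hypothetical stand-ins, and all function names are ours.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def row_normalized_confusion(y_true, y_pred):
    """Return [[TN rate, FP rate], [FN rate, TP rate]]; each row sums to 1."""
    counts = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    return [[c / sum(row) for c in row] for row in counts]


def cross_validate(features, labels, fit, predict, k=5):
    """Pool held-out predictions over k folds into one confusion matrix."""
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(labels), k):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        for i in held_out:
            y_true.append(labels[i])
            y_pred.append(predict(model, features[i]))
    return row_normalized_confusion(y_true, y_pred)


# Hypothetical stand-in classifier: threshold halfway between the class means.
def fit_threshold(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return int(x > threshold)


# Synthetic scalar "features" standing in for real patches.
rng = random.Random(1)
labels = [i % 2 for i in range(200)]
features = [rng.gauss(2.0 * y, 1.0) for y in labels]

matrix = cross_validate(features, labels, fit_threshold, predict_threshold)
print(matrix)
```

Pooling the held-out predictions before normalizing means every patch contributes exactly once to the matrix, which matters for a dataset of only about 1,000 patches.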

Final Results

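The table below shows the performance of the whole pipeline. The second stage improves precision by roughly 24 percentage points on average (from 73.6% to 95.8% at 1.5 m and from 69.0% to 94.7% at 2.5 m), at the cost of lowering recall by roughly 8 percentage points (from 93.6% to 84.2% and from 76.2% to 68.6%, respectively).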

Distance	TP	FN	FP	Precision	Recall
1.5 m	329	62	14	95.8%	84.2%
2.5 m	268	122	14	94.7%	68.6%

Result of the whole pipeline
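As a rough sanity check, the whole-pipeline numbers can be estimated by applying the classifier's rates from the confusion matrix to the first-stage counts. The sketch below assumes that the classifier keeps about 90% of true dents and passes about 10% of false detections regardless of distance; this is a simplifying assumption for illustration, and the function name is ours.

```python
TOTAL_DENTS = 391
TPR = 0.90  # fraction of true dents kept by the classifier (confusion matrix)
FPR = 0.10  # fraction of false detections wrongly kept (confusion matrix)

# First-stage counts (TP, FP) from the first-stage table.
first_stage = {"1.5 m": (366, 131), "2.5 m": (298, 134)}


def compose(tp, fp, total=TOTAL_DENTS, tpr=TPR, fpr=FPR):
    """Estimate whole-pipeline (precision, recall) from first-stage counts."""
    tp_kept = tp * tpr
    fp_kept = fp * fpr
    precision = tp_kept / (tp_kept + fp_kept)
    recall = tp_kept / total
    return precision, recall


estimates = {d: compose(tp, fp) for d, (tp, fp) in first_stage.items()}
for distance, (p, r) in estimates.items():
    print(f"{distance}: estimated precision={p:.1%}, estimated recall={r:.1%}")
# 1.5 m: estimated precision=96.2%, estimated recall=84.2%
# 2.5 m: estimated precision=95.2%, estimated recall=68.6%
```

The estimated recall agrees with the table to one decimal place, and the estimated precision lands within roughly half a percentage point of the reported values.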