Semantic Maps for Relocalization

Setup

Given trajectories from multiple passes through a scene, we collect pairs of frames that are close to each other in the world coordinate system:

Fig 1: Pairs of RGBD frames, with their relative pose as ground truth, are used to evaluate relocalization
Fig 2: Pipeline for PnP based alignment of an RGBD frame with 3D ORB keypoints for relative pose estimation
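The pair collection described at the start of this section can be sketched as follows. This is only an illustration under stated assumptions: each frame is assumed to come with a 4x4 camera-to-world pose matrix, closeness is assumed to mean Euclidean distance between camera centers, and the names `collect_pairs`, `relative_pose`, and the threshold `max_dist` are hypothetical, not taken from the original pipeline.

```python
import numpy as np

def relative_pose(T_wa, T_wb):
    """Pose of camera b expressed in the frame of camera a (assumes camera-to-world inputs)."""
    return np.linalg.inv(T_wa) @ T_wb

def collect_pairs(poses_a, poses_b, max_dist=0.5):
    """Pair frames from two passes whose camera centers lie within max_dist.

    poses_a, poses_b: sequences of 4x4 camera-to-world matrices (assumed format).
    max_dist: hypothetical distance threshold, in the unit of the poses.
    Returns a list of (index_a, index_b, ground-truth relative pose).
    """
    pairs = []
    for i, T_a in enumerate(poses_a):
        for j, T_b in enumerate(poses_b):
            # Compare camera centers, i.e. the translation column of each pose.
            if np.linalg.norm(T_a[:3, 3] - T_b[:3, 3]) <= max_dist:
                pairs.append((i, j, relative_pose(T_a, T_b)))
    return pairs
```

The relative pose returned for each pair is what an estimated pose would later be compared against.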

Compared to brute-force matching, semantic-map-based matching keeps only those correspondences whose keypoints belong to the same semantic category in both images.
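A minimal sketch of this filtering step, assuming matches are given as index pairs into two keypoint arrays and that each image has a per-pixel integer label map. The function name `filter_matches_by_semantics` and the data layout are assumptions made for illustration.

```python
import numpy as np

def filter_matches_by_semantics(matches, kps_a, kps_b, sem_a, sem_b):
    """Keep only matches whose two keypoints fall on the same semantic label.

    matches: list of (idx_a, idx_b) index pairs (assumed format).
    kps_a, kps_b: (N, 2) arrays of (x, y) pixel coordinates.
    sem_a, sem_b: (H, W) integer label maps for the two images.
    """
    kept = []
    for ia, ib in matches:
        xa, ya = np.round(kps_a[ia]).astype(int)
        xb, yb = np.round(kps_b[ib]).astype(int)
        # Label maps are indexed as [row, column], i.e. [y, x].
        if sem_a[ya, xa] == sem_b[yb, xb]:
            kept.append((ia, ib))
    return kept
```

The surviving matches would then be passed to the same pose estimation step as the brute-force matches.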

Quantitative Results

Table 1: Relocalization accuracy for brute-force, semantic-map-based, and hybrid matching. Relocalization counts as successful if the estimated pose error is below ~0.1 m in translation and below ~30 deg in rotation.
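The success criterion can be sketched as below. The exact error definitions are not given in the text, so this assumes the translation error is the Euclidean distance between the two translation vectors and the rotation error is the angle of the residual rotation. The thresholds default to the approximate values quoted above, and the function names are hypothetical.

```python
import numpy as np

def pose_errors(T_est, T_gt):
    """Translation error (unit of the poses) and rotation error (degrees) between 4x4 poses."""
    t_err = np.linalg.norm(T_est[:3, 3] - T_gt[:3, 3])
    # Residual rotation between estimate and ground truth.
    R_delta = T_est[:3, :3].T @ T_gt[:3, :3]
    # Rotation angle from the trace; clip guards against rounding outside [-1, 1].
    cos_angle = np.clip((np.trace(R_delta) - 1.0) / 2.0, -1.0, 1.0)
    return float(t_err), float(np.degrees(np.arccos(cos_angle)))

def is_success(T_est, T_gt, max_trans=0.1, max_rot_deg=30.0):
    """Apply the (approximate) thresholds from the Table 1 caption."""
    t_err, r_err = pose_errors(T_est, T_gt)
    return t_err < max_trans and r_err < max_rot_deg
```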

The brute-force approach outperforms the semantic-map-based approach. However, the errors the two approaches make appear to be complementary:

Fig 3: Per-sample difference in error between the semantic and brute-force approaches, sorted in increasing order. A positive value means that, for that sample, semantic-map-based matching produces the larger error; a negative value means brute-force matching does.

We therefore try a hybrid approach: both methods are run, and the number of inliers is used to choose between their results. The figure below shows that this does improve performance but, as Table 1 shows, the improvement is small.

Fig 4: Per-sample difference in error between the semantic and hybrid approaches, sorted in increasing order. A positive value means that, for that sample, semantic-map-based matching produces the larger error, and vice versa. Notice that it looks like a thinned-out version of the previous graph.
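The hybrid selection rule can be sketched as follows. The text only says that the number of inliers decides between the two methods, so this assumes the estimate with more inliers is kept and that ties go to brute force. The `PoseEstimate` container and `choose_hybrid` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class PoseEstimate:
    method: str        # "brute_force" or "semantic"
    num_inliers: int   # inlier count reported by the pose estimator
    pose: object       # e.g. a 4x4 matrix; left opaque in this sketch

def choose_hybrid(brute, semantic):
    """Keep the estimate with more inliers; ties go to brute force (assumption)."""
    return semantic if semantic.num_inliers > brute.num_inliers else brute
```

Because the choice relies on inlier counts alone, it can still pick the worse estimate when a wrong pose happens to gather many inliers, which would be consistent with the small gain reported in Table 1.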

Qualitative Results

The qualitative results below show pairs of images in RGB and in semantic-map format. Matching is done on RGB features, but the semantic maps guide which matches are kept, which is why we show the semantic maps for approach (ii). Translation and rotation errors are reported for every case.

A: Semantic maps performing better than brute force approach:

B: Brute force approach performing better: