Summary
This semester, our progress falls into two areas:
(1) Panoptic Segmentation
(2) From Panoptic Segmentor to Detector
At the end of the semester, we also submitted the current version of our Panoptic Segmentor to the NeurIPS ’21 AI Driving Olympics Workshop and ranked in the top 3 in both the LiDAR track and the Open track (as Team_AX_Semantic).
Presentation Links
- NeurIPS ’21 AI Driving Olympics Workshop [Link]
Panoptic Segmentation
To improve panoptic segmentation quality, we made four modifications to the existing panoptic segmentation method DS-Net.
The first three modifications are illustrated in the following two figures:
With these three modifications, we achieved SOTA performance on nuScenes. We also achieved the second-best performance on SemanticKITTI without integrating Accumulation.
The following table ablates the performance gain contributed by each modification.
Issues with the current model
During the experiments, we noticed that one of the most critical issues with the current method is its low recall compared with SOTA 3D object detectors.
There are two main reasons: (1) a detector tends to output redundant boxes and reject some of them later, whereas (2) a segmentor tends to output a single best label for each point.
Therefore, to improve the recall of our panoptic segmentor, we allowed more than one predicted label for each point instead of clustering over only the top-1 semantic label. This significantly improved the recall of our model. The next step is to improve precision.
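The idea of keeping several candidate labels per point can be sketched as follows. This is only an illustration, not our actual implementation: the probability threshold, the fallback to the top-1 label, and the function names candidate_labels and group_points_by_class are assumptions made for the example. Each resulting per-class group would then be passed to the clustering step.

```python
def candidate_labels(probs, threshold=0.3):
    """For each point, keep every class whose probability meets the threshold.

    Falls back to the single top-1 class when no class meets the threshold.
    The threshold rule is an assumption made for this sketch.
    """
    result = []
    for p in probs:
        keep = [c for c, v in enumerate(p) if v >= threshold]
        if not keep:
            keep = [max(range(len(p)), key=p.__getitem__)]
        result.append(keep)
    return result


def group_points_by_class(probs, threshold=0.3):
    """Map each class id to the indices of points that list it as a candidate."""
    groups = {}
    for idx, labels in enumerate(candidate_labels(probs, threshold)):
        for c in labels:
            groups.setdefault(c, []).append(idx)
    return groups


# Toy per-point class probabilities (3 points, 3 classes).
probs = [
    [0.55, 0.40, 0.05],  # ambiguous between class 0 and class 1
    [0.10, 0.85, 0.05],  # confidently class 1
    [0.34, 0.33, 0.33],  # ambiguous among all three classes
]

multi = group_points_by_class(probs, threshold=0.3)
top1 = group_points_by_class(probs, threshold=0.9)  # behaves like top-1 here
print(multi)
print(top1)
```

With the lower threshold, ambiguous points take part in the clustering of several classes, which is what raises recall. The cost is more candidate instances, which is why precision becomes the next problem.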
We are still working on improving precision, but we ran some initial experiments to test possible directions. We trained a “scorer”, an MLP regressor, to predict a confidence score for each predicted 3D bounding box. Plotting the distribution of the scores predicted by this “scorer” for positive and negative samples shows that it performs reasonably well: on average, it assigns scores > 0.5 to positive samples and scores < 0.5 to negative samples.
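The check described above can be written as a small summary over the scorer outputs. The sketch below uses made-up scores and a hypothetical helper summarize_scores. It is not our evaluation code, and the numbers are not experimental results.

```python
from statistics import mean


def summarize_scores(scores, is_positive, threshold=0.5):
    """Summarize scorer outputs separately for positive and negative boxes."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    return {
        "mean_pos": mean(pos),
        "mean_neg": mean(neg),
        # Fraction of samples on the correct side of the threshold.
        "pos_above": sum(s > threshold for s in pos) / len(pos),
        "neg_below": sum(s < threshold for s in neg) / len(neg),
    }


# Made-up scores for illustration only.
scores = [0.9, 0.7, 0.4, 0.2, 0.1, 0.6]
is_positive = [True, True, True, False, False, False]

summary = summarize_scores(scores, is_positive)
print(summary)
```

In this toy example the mean scores fall on the correct side of 0.5 for both groups, yet one positive and one negative sample are still misplaced. Separation on average is therefore a necessary first check, not evidence that thresholding the score removes all false positives.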
From Panoptic Segmentor to Detector
We aim to explore the possibility of extending a Panoptic Segmentor into a 3D Object Detector for a standard 3D LiDAR Object Detection task. To achieve this goal, we designed and trained an MLP regressor that outputs an amodal bounding box from the instance labels produced by our Panoptic Segmentor. Here is a definition of an amodal bounding box:
Our first attempt was a heuristic method that simply takes the mean of all points belonging to an instance as the box center. However, it can only output a modal box rather than an amodal box. Therefore, we propose an MLP regressor:
Our proposed MLP regressor performs much better than the heuristic method, but it still suffers in cases where an instance contains very few points.
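To make the limitation of the heuristic baseline concrete, the sketch below computes the mean of the points of a partially observed instance. The object geometry and point coordinates are hypothetical values chosen for the example: a box centered at x = 10 of which the LiDAR only sees the near face at x = 8.

```python
def centroid(points):
    """Mean of a list of (x, y, z) points, used as the heuristic box center."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


# Hypothetical object: true (amodal) box center at x = 10.0.
true_center_x = 10.0

# Only the face nearest to the sensor (x = 8.0) returns LiDAR points.
visible_points = [
    (8.0, -0.8, 0.5),
    (8.0, 0.0, 0.5),
    (8.0, 0.8, 0.5),
    (8.0, 0.0, 1.0),
]

center = centroid(visible_points)
offset_x = true_center_x - center[0]
print(center, offset_x)
```

The estimated center lies on the visible surface, so it is offset from the true center by the unobserved part of the object. A learned regressor can in principle compensate for this offset, while the mean of the observed points cannot.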
For more information, please refer to our presentation at the NeurIPS ’21 AI Driving Olympics Workshop.