Introduction

Motivation

Autonomous navigation requires a robot to operate in and interact with a dynamic, open environment. For safe navigation, the vehicle must identify other agents in the environment, such as people and other vehicles, to avoid collisions. At the same time, it is essential to understand the semantics of the static scene, such as the drivable region. Lidar Panoptic Segmentation (LPS) aims to solve both of these tasks jointly, i.e., performing instance segmentation and semantic segmentation on point clouds.

Fig 1: If an autonomous vehicle has never seen a stroller before, failing to recognize one may lead to a catastrophe.

However, the LPS setup fails to consider realistic testing environments. First, in the real world there are constant distribution shifts, which LPS does not account for. Second, and more crucially, the network is trained to segment only regions that belong to a predefined vocabulary of K classes. If a rare object such as a stroller is encountered (Fig 1), the network is not trained to recognize it, which may lead to catastrophic consequences. Ideally, the network should flag such points as unknown or other (i.e., not seen in the training set).

The ability to segment these unseen classes is crucial for safety-critical applications such as autonomous navigation. It may also be useful for data labeling, especially for classes in the long tail.

Problem Statement

Left: LPS aims to classify points into one of K pre-defined classes.
Right: In contrast, LiPSOW requires novel objects to be segmented as other.

To overcome the challenges of using LPS methods in real-world settings, we propose a new task that extends LPS to an open-world setting (LiPSOW). Under LiPSOW, methods are evaluated on a test set whose data distribution differs from that of the training set; to do well, a method must therefore handle domain shift. Moreover, to account for novel objects from the long tail, we evaluate algorithms on their ability to recognize other points and to segment instances within other. In summary, in addition to the stuff and thing classes of LPS, we introduce an other class, which may internally consist of stuff or things, and LiPSOW methods must segment instances from both the thing and other classes.
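To make this output format concrete, below is a minimal sketch of what a LiPSOW-style prediction looks like. The class names and integer ids are hypothetical placeholders rather than the benchmark's actual vocabulary; the point is that every lidar point receives either one of the K known labels or the catch-all other label, and instance ids are required for both thing and other points.

```python
import numpy as np

# Hypothetical class vocabulary; the benchmark defines its own K classes.
STUFF = {0: "road", 1: "sidewalk", 2: "vegetation"}
THING = {3: "car", 4: "pedestrian"}
OTHER = 5  # catch-all for categories never seen during training (e.g., a stroller)

def validate_lipsow_prediction(semantic: np.ndarray, instance: np.ndarray) -> None:
    """Check that per-point predictions respect the LiPSOW output format.

    semantic: (N,) array with one class id per lidar point.
    instance: (N,) array; 0 means "no instance", ids > 0 index instances.
    """
    known_ids = set(STUFF) | set(THING) | {OTHER}
    assert set(np.unique(semantic).tolist()) <= known_ids, "unexpected class id"

    # Stuff points carry no instance id, exactly as in LPS.
    assert np.all(instance[np.isin(semantic, list(STUFF))] == 0)

    # Unlike LPS, *both* thing and other points must be grouped into instances.
    needs_instance = np.isin(semantic, list(THING)) | (semantic == OTHER)
    assert np.all(instance[needs_instance] > 0), "thing/other point missing an instance id"

# Example: four points -> road, two points of one car, one unseen object.
semantic = np.array([0, 3, 3, 5])
instance = np.array([0, 1, 1, 2])
validate_lipsow_prediction(semantic, instance)
```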

Our Contributions

In this work, our contributions are three-fold:

  1. We introduce a new problem setting, Lidar Panoptic Segmentation in an Open World (LiPSOW), along with protocols for evaluating methods in this setting.
  2. Using existing work in LPS, we develop strong baselines for LiPSOW and analyze their performance.
  3. We propose our method, Hierarchical LiDAR Panoptic Segmentation (HLPS), which combines geometric clustering and lidar semantic segmentation to achieve strong performance on LiPSOW.
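To illustrate the second ingredient, here is a minimal sketch of how class-agnostic geometric clustering can turn per-point semantic predictions into instances for both known thing classes and the other class. This is not the exact HLPS pipeline; the DBSCAN clusterer, class ids, and hyperparameters below are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

THING_IDS = {3, 4}  # hypothetical ids for known "thing" classes
OTHER_ID = 5        # catch-all id for unseen categories

def cluster_instances(points: np.ndarray, semantic: np.ndarray,
                      eps: float = 0.5, min_points: int = 10) -> np.ndarray:
    """Group thing/other points into instances via Euclidean clustering.

    points:   (N, 3) xyz coordinates of the lidar sweep.
    semantic: (N,) per-point class ids from a semantic segmentation network.
    Returns an (N,) instance-id array; 0 means "no instance".
    """
    instance = np.zeros(len(points), dtype=np.int64)
    next_id = 1
    # Cluster each class separately so that, e.g., an unknown object standing
    # next to a car is not merged into the car's instance.
    for cls in sorted(THING_IDS | {OTHER_ID}):
        mask = semantic == cls
        if not mask.any():
            continue
        labels = DBSCAN(eps=eps, min_samples=min_points).fit_predict(points[mask])
        point_idx = np.where(mask)[0]
        for k in np.unique(labels):
            if k == -1:  # DBSCAN noise points get no instance id
                continue
            instance[point_idx[labels == k]] = next_id
            next_id += 1
    return instance
```

Per-class clustering is a deliberate design choice in this sketch: grouping points within each predicted class keeps a novel object from being absorbed into a neighboring car, while still allowing instances to be formed for other points that the semantic network cannot name.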