Introduction - Forecasting for Autonomous Driving

Autonomous Driving

Autonomous driving is a complex domain that involves navigating diverse environments without human intervention. A high-level goal of such systems is safe and comfortable motion planning to a desired location. This is challenging because future predictions are inherently uncertain and because supervised learning requires large annotated datasets.

Forecasting

Motion forecasting is a major component of safe navigation. It refers to estimating the future state of a scene containing many actors (pedestrians, vehicles, etc.) so that the ego-vehicle can take appropriate action in time. It is needed for both safety and a comfortable ride.

Project

Phase 1

With this project, we intend to explore methods in the domain of forecasting, specifically freespace forecasting. Freespace is the unoccupied drivable region available to the ego-vehicle. We aim to forecast future freespace from historical freespace. The main motivation is that this can be learned in a self-supervised way, with no annotation required. Object-based forecasting, by contrast, requires human annotation of objects and their locations, which is costly and therefore limited in scale. Freespace can be obtained by raycasting the LiDAR returns recorded while driving the vehicle. We are also exploring ways to combine LiDAR-based freespace forecasting with image-based methods to exploit the complementary information in each modality.
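To make the raycasting idea concrete, here is a minimal sketch of how freespace labels could be derived from a single 2D LiDAR sweep. It is not the project's actual pipeline: the function name `raycast_freespace`, the grid size, the cell resolution, the maximum range, and the half-cell sampling step are all assumptions chosen for illustration. Each ray is walked outward from the sensor; cells it passes through are marked free, the cell containing the return is marked occupied, and everything behind it stays unknown.

```python
import math

UNKNOWN, FREE, OCCUPIED = -1, 0, 1

def raycast_freespace(returns, grid_size=21, cell_m=1.0, max_range_m=10.0):
    """Label grid cells from LiDAR returns given as (angle_rad, range_m) pairs.

    Hypothetical helper: the sensor sits at the grid center, rows follow the
    y axis and columns follow the x axis of the ego frame.
    """
    grid = [[UNKNOWN] * grid_size for _ in range(grid_size)]
    center = grid_size // 2
    step = cell_m / 2.0  # sample each ray at half-cell spacing (assumption)

    def to_cell(angle, dist):
        col = center + int(round(dist * math.cos(angle) / cell_m))
        row = center + int(round(dist * math.sin(angle) / cell_m))
        return row, col

    def in_bounds(row, col):
        return 0 <= row < grid_size and 0 <= col < grid_size

    for angle, rng in returns:
        limit = min(rng, max_range_m)
        # Walk along the ray and mark every traversed cell as free.
        for i in range(int(limit / step) + 1):
            row, col = to_cell(angle, i * step)
            if not in_bounds(row, col):
                break
            if grid[row][col] != OCCUPIED:
                grid[row][col] = FREE
        # A return inside the maximum range means the ray hit an obstacle.
        if rng < max_range_m:
            row, col = to_cell(angle, rng)
            if in_bounds(row, col):
                grid[row][col] = OCCUPIED
    return grid

# One ray hits an obstacle 3 m ahead; another sees nothing within range.
grid = raycast_freespace([(0.0, 3.0), (math.pi / 2, 20.0)])
```

No human labels are involved at any point, which is what makes freespace usable as a self-supervised training target. A real system would work with 3D point clouds, account for ground points, and accumulate sweeps across ego-motion, none of which this sketch attempts.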

Phase 2

We observed that using freespace as additional supervision in models is effective. We hypothesize that explicit supervision on future freespace helps the model learn more effectively. However, we also saw that the model still struggles to learn cues from stop signs and traffic lights. One possible reason is that implicitly learning signals from such small objects in the image is difficult for the model. Hence, in this phase we intend to build on the idea of explicit supervision by training the model to predict cost maps for important entities such as stop signs, traffic lights, waypoints, and freespace. We intend to use the CARLA leaderboard dataset to supervise the training so that the model outputs a separate cost map for each of these entities. These cost maps can then be combined and an optimal trajectory inferred from the result.
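As a rough illustration of the last step, the sketch below combines per-entity cost maps and picks a trajectory from a small candidate set. The text does not specify how the maps are combined or how the trajectory is inferred, so the weighted per-cell sum, the weights, the toy 3x3 maps, and the function names are all assumptions. Choosing the cheapest of a few hand-written candidates is also only a stand-in for a proper trajectory optimizer.

```python
def combine_cost_maps(cost_maps, weights):
    """Weighted per-cell sum of equally sized 2D cost maps (lists of lists)."""
    names = list(cost_maps)
    rows = len(cost_maps[names[0]])
    cols = len(cost_maps[names[0]][0])
    return [
        [sum(weights[n] * cost_maps[n][r][c] for n in names) for c in range(cols)]
        for r in range(rows)
    ]

def trajectory_cost(cost_map, trajectory):
    """Total cost of the (row, col) cells visited by one candidate trajectory."""
    return sum(cost_map[r][c] for r, c in trajectory)

def best_trajectory(cost_map, candidates):
    """Name of the candidate trajectory with the lowest total cost."""
    return min(candidates, key=lambda name: trajectory_cost(cost_map, candidates[name]))

# Toy cost maps (hypothetical values): row 2 is nearest the ego-vehicle.
cost_maps = {
    "freespace": [[0, 5, 0], [0, 5, 0], [0, 0, 0]],  # center lane is blocked ahead
    "stop_sign": [[0, 0, 2], [0, 0, 0], [0, 0, 0]],  # penalty near a stop sign
}
weights = {"freespace": 1.0, "stop_sign": 1.0}

combined = combine_cost_maps(cost_maps, weights)
candidates = {
    "straight": [(2, 1), (1, 1), (0, 1)],
    "left": [(2, 1), (1, 0), (0, 0)],
    "right": [(2, 1), (1, 2), (0, 2)],
}
choice = best_trajectory(combined, candidates)
```

Keeping one map per entity means each output can be supervised and inspected separately, and the relative weights can be adjusted without retraining the model.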