Single-view 3D Keypoint Reconstruction for Vehicles (Xudong Chen, Fangyu Li) - Open Vision Platform for Smart City Intersections & VQA from Camera Networks

The aim of this project is to recover the 3D pose and shape of vehicles at intersections, as a contribution to the “Open Vision Platform for Smart City”. Successful estimation of 3D vehicle information opens possibilities for many intelligent city applications, such as traffic analytics, vehicle localization, velocity estimation, anomaly detection, and trajectory prediction. Reconstructing the 3D shape and pose of an object from a single view is very challenging and has been extensively studied in the computer vision community. In this project, we utilize the vehicle trajectory in the video sequence as an additional cue to rectify the poses.

The algorithm pipeline is as follows:

the initial pose is estimated by solvePnP with car-centric RANSAC:

to refine the 3D shape and pose of each vehicle, we use Levenberg-Marquardt (LM) optimization to jointly optimize the shape coefficients and poses. The loss terms are as follows:
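A hedged sketch of such a joint refinement, using SciPy's `least_squares` with `method="lm"`. The project's actual loss terms are not reproduced here. This example assumes two terms: a keypoint reprojection error and a simple prior that pulls the shape coefficients toward zero. The linear shape model (mean shape plus a weighted deformation basis), the random basis, the intrinsics, and all numeric values are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(0)

# Hypothetical mean shape (8 keypoints, metres) and 2 deformation basis shapes.
mean_shape = np.array([
    [ 0.9,  2.0, 0.3], [-0.9,  2.0, 0.3], [ 0.9, -2.0, 0.3], [-0.9, -2.0, 0.3],
    [ 0.7,  1.0, 1.4], [-0.7,  1.0, 1.4], [ 0.7, -1.2, 1.4], [-0.7, -1.2, 1.4],
])
basis = rng.normal(scale=0.1, size=(2, 8, 3))

# Assumed pinhole intrinsics.
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])

def project(params):
    """params = [rotation vector (3), translation (3), shape coefficients (2)]."""
    rvec, tvec, alpha = params[:3], params[3:6], params[6:]
    shape = mean_shape + np.tensordot(alpha, basis, axes=1)      # deformed shape, (8, 3)
    cam = shape @ Rotation.from_rotvec(rvec).as_matrix().T + tvec  # camera frame
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]                                # pixel coordinates

def residuals(params, observed, reg_weight=1.0):
    reproj = (project(params) - observed).ravel()  # reprojection term (pixels)
    prior = reg_weight * params[6:]                # assumed shape prior term
    return np.concatenate([reproj, prior])

# Synthetic observations from known parameters, then a perturbed initial guess.
true_params = np.array([1.2, 0.1, -0.1, 0.5, 1.0, 15.0, 0.8, -0.5])
observed = project(true_params)
init = true_params + np.array([0.05, -0.05, 0.05, 0.3, -0.3, 1.0, -0.8, 0.5])

initial_cost = 0.5 * np.sum(residuals(init, observed) ** 2)
result = least_squares(residuals, init, args=(observed,), method="lm")

print("initial cost:", initial_cost)
print("final cost:  ", result.cost)
print("refined shape coefficients:", result.x[6:])
```

Stacking pose and shape coefficients into one parameter vector is what makes the optimization joint: every LM step updates both at once.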

and the result of the optimization is shown below:

after obtaining the pose of each vehicle at different timestamps, we fit a second-order spline to each vehicle's trajectory, using the 3D poses estimated by the optimization at each timestamp:
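As a rough illustration of the fitting step, the sketch below fits a single quadratic segment per coordinate with `np.polyfit`. This is an assumption made for brevity: a second-order spline with several knots would be piecewise quadratic instead. The ground-plane positions, timestamps, and noise level are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-frame ground-plane positions (x, y) of one vehicle, in metres.
t = np.linspace(0.0, 3.0, 30)  # timestamps in seconds
truth = np.stack([2.0 + 8.0 * t - 0.5 * t**2,
                  1.0 + 0.5 * t + 0.8 * t**2], axis=1)
noisy = truth + rng.normal(scale=0.3, size=truth.shape)  # per-frame estimates

# Least-squares quadratic fit, one polynomial per coordinate.
# coeffs has shape (3, 2): rows are the t^2, t, and constant coefficients.
coeffs = np.polyfit(t, noisy, deg=2)

# Evaluate the fitted trajectory at the original timestamps.
fitted = np.vander(t, 3) @ coeffs

# The analytic derivative of the fit gives a smooth velocity estimate.
velocity = 2.0 * coeffs[0] * t[:, None] + coeffs[1]
speed = np.linalg.norm(velocity, axis=1)

print("mean error before fit:", np.abs(noisy - truth).mean())
print("mean error after fit: ", np.abs(fitted - truth).mean())
```

Because the fit pools all timestamps, it averages out per-frame noise. The same property makes it usable as a cue for correcting individual poses.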

finally, we visualize the effect of pose rectification from the fitted trajectory:
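The rectification rule itself is not spelled out here. One plausible reading, shown purely as an assumption, is to pull each frame's estimated heading (yaw) toward the tangent direction of the fitted trajectory. The function name `rectify_yaw`, the `weight` parameter, and the example values are hypothetical.

```python
import numpy as np

def rectify_yaw(coeffs, t, yaw_est, weight=1.0):
    """Blend per-frame yaw estimates toward the trajectory tangent direction.

    coeffs : (3, 2) array of quadratic coefficients (t^2, t, constant) for x and y.
    t      : (T,) timestamps.
    yaw_est: (T,) per-frame yaw estimates in radians.
    weight : 0 keeps the original yaw, 1 replaces it with the tangent heading.
    """
    # Tangent (velocity) of the quadratic trajectory at each timestamp.
    vel = 2.0 * coeffs[0] * t[:, None] + coeffs[1]
    yaw_traj = np.arctan2(vel[:, 1], vel[:, 0])
    # Wrap the angular difference into [-pi, pi) before blending.
    diff = (yaw_traj - yaw_est + np.pi) % (2.0 * np.pi) - np.pi
    return yaw_est + weight * diff

# Example: a vehicle driving in a straight line at 45 degrees.
coeffs = np.array([[0.0, 0.0],   # t^2 coefficients
                   [1.0, 1.0],   # t coefficients
                   [0.0, 0.0]])  # constants
t = np.array([0.0, 1.0, 2.0])
yaw_est = np.array([0.6, 1.0, 0.7])  # noisy per-frame headings

rectified = rectify_yaw(coeffs, t, yaw_est)
print(rectified)
```

A `weight` below 1 keeps part of the per-frame estimate, which may be preferable when the vehicle is nearly stationary and the tangent direction is poorly defined.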

A visualization of the 3D reconstruction can be seen here:

The video of the presentation: