mengqinj, Author at Open Vision Platform for Smart City Intersections & VQA from Camera Networks

We have three sub-projects, each briefly introduced below. The combined version of the slides can be found here.

Single-view 3D Keypoint Reconstruction for Vehicles

This project reconstructs the 3D keypoints of vehicles from single-view images in order to detect and track vehicles in 3D space. Given the vehicle keypoint detections from Occlusion-Net, we proposed 1) Car-Centric RANSAC to reject outliers and recover pose, 2) dimensionality reduction by PCA on features, and 3) joint optimization of the vehicle PCA coefficients and poses. To refine the tracking results, we further applied trajectory spline fitting and pose rectification.
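The PCA step can be illustrated with a minimal sketch. It assumes the PCA is applied to flattened 3D keypoint shapes, which may differ from the features used in the project; the function names `fit_shape_pca`, `encode_shape`, and `decode_shape` are hypothetical and chosen only for this example.

```python
import numpy as np

def fit_shape_pca(shapes, n_components):
    """Fit a PCA basis to vehicle shapes.

    shapes: array of shape (N, K, 3), N vehicles with K 3D keypoints each
            (assumed input layout, not taken from the project).
    Returns the mean shape (K*3,) and the basis (n_components, K*3).
    """
    flat = shapes.reshape(len(shapes), -1)
    mean = flat.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:n_components]

def encode_shape(shape, mean, basis):
    """Project one (K, 3) shape onto the basis to get its PCA coefficients."""
    return basis @ (shape.reshape(-1) - mean)

def decode_shape(coeffs, mean, basis):
    """Rebuild a (K, 3) shape from its PCA coefficients."""
    return (mean + coeffs @ basis).reshape(-1, 3)
```

With such a basis, each vehicle is described by a handful of coefficients instead of K independent 3D points, which is what makes a joint optimization over coefficients and poses tractable.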

[slides]

Two-view human pose reconstruction

This project reconstructs the 3D keypoints of humans in indoor scenes from two camera views. The work consists of two parts. The first part reconstructs the 3D keypoints from paired 2D keypoint detections using the standard epipolar geometry method. To make the reconstructed joint locations more accurate, the second part borrows the idea of the PoseFix network and extends it from 2D to 3D. This corrects the joint locations directly in 3D space, which reduces the reconstruction error.
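For the first part, one common way to recover a 3D point from a pair of 2D detections is linear (DLT) triangulation. The sketch below assumes both cameras are calibrated and their 3x4 projection matrices are known; the function name `triangulate_point` is hypothetical, and the project may have used a different variant of the method.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one keypoint from two views.

    P1, P2: 3x4 camera projection matrices (assumed known from calibration).
    x1, x2: (u, v) pixel coordinates of the same joint in each view.
    Returns the 3D point in the world frame.
    """
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

Applying this to every pair of matched joints gives an initial 3D skeleton, which a refinement stage can then correct.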

[slides]

Semi-supervised vehicle detection with missing labels

Zensors provides a service that lets users upload a camera view, select a region of interest, and ask a question such as “How many cars are there in the parking lot?” As a result, the Zensors dataset is partially labeled, i.e., only objects inside the region of interest have annotations. We proposed three methods to train a detection model on such a partially labeled dataset: 1) confidence-score-based self-training, 2) DeepCluster, and 3) a discriminator for positive and background examples. To simulate this problem, we use the COCO dataset and randomly drop 50% of the labels.
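The label-dropping simulation and the selection step of confidence-score-based self-training can be sketched as follows. This is an illustration under assumptions: annotations and detections are plain Python lists, labels are dropped uniformly at random over the given list, and the 0.9 threshold is an arbitrary placeholder. The helper names `drop_labels` and `select_pseudo_labels` are hypothetical.

```python
import random

def drop_labels(annotations, drop_ratio=0.5, seed=0):
    """Simulate missing labels by randomly removing a fraction of annotations.

    annotations: list of annotation records (any type).
    Returns the kept annotations in their original order.
    """
    rng = random.Random(seed)
    n_keep = len(annotations) - int(len(annotations) * drop_ratio)
    keep_idx = sorted(rng.sample(range(len(annotations)), n_keep))
    return [annotations[i] for i in keep_idx]

def select_pseudo_labels(detections, threshold=0.9):
    """Keep only detections confident enough to be reused as training labels.

    detections: list of dicts with a "score" key (assumed format).
    """
    return [d for d in detections if d["score"] >= threshold]
```

In a self-training loop, a detector trained on the kept labels would predict on the same images, and the detections returned by `select_pseudo_labels` would be added to the label set before retraining.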

[slides]

Other documents

Our capstone presentation on 11/01/2019.

Author: mengqinj

Final presentation