Project Summary

Motivation

Modeling and understanding pedestrian behavior is an important part of building safe and secure smart cities. It is a core component of video surveillance and has drawn increasing attention in recent years for applications such as pedestrian walking path prediction, traffic flow segmentation, crowd counting and segmentation, and abnormal event detection. So why predict pedestrian trajectories? Some applications are:

Human Behavior Analysis:
1. Security Surveillance
2. Action Prediction
3. Intersection planning

Safety in Autonomous Vehicles:
1. Trajectory planning
2. Improved safety

Why is it challenging?

Pedestrian behavior modeling is challenging, especially for scenes with crowds. Previous studies have shown that the walking behavior of an individual can be influenced by a variety of factors including scene layout (e.g. entrances, exits, walls, and obstacles), pedestrian beliefs (the choice of source and destination), and interactions with other moving pedestrians.

Understanding the context of human behavior and actions is challenging

As seen in the figure above, the same action can have different meanings based on the context and situation. Thus, understanding the context of specific human actions can help predict anomalous activities like crimes in advance. This will ultimately enable us to build behavioral twins of intersections in smart cities.

Problem Statement

As explained in our motivation above, we aim to model and understand human behavior at traffic intersections. We aim to leverage 2D pose estimation in multiple views, triangulation, and 3D trajectory forecasting to predict 3D pose trajectories.

Problem statement – Go beyond 2D point trajectories to predict 3D pose trajectories

The two possible scenarios for this problem statement are trajectory forecasting and action forecasting. However, the focus of this project is trajectory forecasting.

Project goal

The project goals are:

1. Predict the 3D trajectory and pose of each pedestrian in the scene.

2. Model each pedestrian with a 3D skeleton rather than only a 2D point trajectory.

3. Leverage high-resolution, time-synchronized, bird's-eye-view static cameras with known camera matrices.

The three steps to solving this problem are:

Step 1: Pose estimation

Example of a pose estimation model

Pose estimation helps predict the 2D joint locations of every pedestrian in the frame.
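To make the output of this step concrete, here is a minimal sketch of one possible data layout for the 2D poses of a single frame. The array shape, the 17-joint COCO-style skeleton, the confidence channel, and the helper name `confident_joints` are all assumptions for illustration; the project's actual pose estimator and output format may differ.

```python
import numpy as np

NUM_JOINTS = 17  # assumption: COCO-style skeleton, not confirmed by the project


def confident_joints(poses_2d, min_conf=0.5):
    """Split 2D poses into pixel coordinates and a per-joint validity mask.

    poses_2d: array of shape (num_people, num_joints, 3), where the last
              axis holds (x, y, confidence) for each joint (assumed layout).
    Returns (xy, mask): xy has shape (num_people, num_joints, 2) and mask
    has shape (num_people, num_joints), True where confidence >= min_conf.
    """
    poses_2d = np.asarray(poses_2d, dtype=float)
    xy = poses_2d[..., :2]
    mask = poses_2d[..., 2] >= min_conf
    return xy, mask
```

A mask like this lets later steps skip joints that the detector could not localize reliably in a given view.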

Step 2: Triangulation

Example of triangulation to estimate 3D world coordinates using 2 camera views

Triangulation is used to obtain ground-truth 3D pose sequences for each pedestrian. Given the camera matrices and a pedestrian's 2D pose in at least two camera views, we can estimate that pedestrian's 3D pose.
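As an illustration, the sketch below triangulates each joint with the standard linear (DLT) method: every view contributes two linear constraints on the homogeneous 3D point, and the solution is the null vector of the stacked system. It assumes 3x4 projection matrices and 2D joints that are already matched across views; the function names are hypothetical, and this is not necessarily the triangulation routine used in the project.

```python
import numpy as np


def project(P, X):
    """Project a 3D point X (3,) to pixel coordinates with a 3x4 matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


def triangulate_point(proj_mats, points_2d):
    """Linear (DLT) triangulation of one joint seen in two or more views.

    proj_mats: sequence of 3x4 projection matrices, one per view.
    points_2d: sequence of (u, v) pixel coordinates, one per view.
    """
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        # Each view gives two equations: u * P[2] - P[0] and v * P[2] - P[1].
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The right singular vector for the smallest singular value solves A X = 0.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def triangulate_pose(proj_mats, poses_2d):
    """Triangulate a whole skeleton.

    poses_2d: array of shape (num_views, num_joints, 2).
    Returns an array of shape (num_joints, 3).
    """
    poses_2d = np.asarray(poses_2d, dtype=float)
    return np.stack([
        triangulate_point(proj_mats, poses_2d[:, j])
        for j in range(poses_2d.shape[1])
    ])
```

With noisy detections, the linear solution is often used as an initial estimate and then refined by minimizing reprojection error; that refinement is omitted here.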

Step 3: Trajectory Forecasting

3D trajectory forecasting

The final block in our pipeline is trajectory forecasting. It takes the observed 3D pose sequence of each pedestrian as input and predicts that pedestrian's most probable future 3D trajectory.
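The forecasting model itself is not described here, so the sketch below uses a constant-velocity baseline as a stand-in: it extrapolates every joint along its most recent frame-to-frame displacement. It only illustrates the input and output shapes of this step, under the assumption that a pose sequence is stored as an array of shape (frames, joints, 3); the function name is hypothetical.

```python
import numpy as np


def constant_velocity_forecast(past_poses, horizon):
    """Extrapolate a 3D pose sequence with a constant-velocity baseline.

    past_poses: array of shape (T, num_joints, 3) with T >= 2 observed frames.
    horizon:    number of future frames to predict.
    Returns an array of shape (horizon, num_joints, 3).
    """
    past_poses = np.asarray(past_poses, dtype=float)
    if past_poses.shape[0] < 2:
        raise ValueError("need at least two observed frames")
    # Per-joint displacement between the last two observed frames.
    velocity = past_poses[-1] - past_poses[-2]
    # Future frame k is the last observed pose plus k times that displacement.
    steps = np.arange(1, horizon + 1).reshape(-1, 1, 1)
    return past_poses[-1] + steps * velocity
```

A baseline like this is a common point of comparison for learned forecasting models, since any useful model should predict more accurately than simple linear extrapolation.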

Code

The code for our project can be found at https://github.com/Michael-MuChienHsu/pedestrian_prediction