Real-time multi-person pedestrian tracking is a major component of many applications, e.g. autonomous driving or autonomous construction, where pedestrians must be detected and tracked in real time for safe operation.
Recently, many works have tackled the problem of real-time single-object tracking. Although they achieve state-of-the-art accuracy while running at hundreds of frames per second, these algorithms track only a single object, which is sub-optimal in realistic settings.
Other recent methods focus on multiple-object tracking and use a detection+tracking+association architecture. These methods achieve strong performance but cannot run in real time, which is essential to our work.
Problem
The main challenge is handling difficult scenarios such as occlusion, appearance change, and pedestrians crossing paths.
The second challenge is that the running time increases with the number of tracked objects.
Solution
We aim to use a single one-stage network for both detection and matching.
We believe a Graph Neural Network (GNN) is useful for simultaneous detection and association.
We designed a Non-Maximum Suppression (NMS) step specifically tailored for the tracking task.
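
For reference, the sketch below shows standard greedy IoU-based NMS, the baseline that a tracking-tailored variant would presumably modify. It is not the tracking-specific NMS mentioned above, whose details are not given here. The names `iou`, `greedy_nms`, the `(x1, y1, x2, y2)` box layout, and the `iou_threshold` default are all assumptions made for illustration.

```python
from typing import List, Sequence

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x2 >= x1, y2 >= y1.
Box = Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Standard greedy NMS (hypothetical helper, not the tracking variant).

    Visits boxes in order of decreasing score and keeps a box only if it
    does not overlap any already-kept box by more than iou_threshold.
    Returns the indices of the kept boxes.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
```

Because this baseline suppresses any heavily overlapping box regardless of identity, it can discard a valid detection when two pedestrians occlude each other, which is one reason a tracking task may call for a tailored variant.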