Motivation
State estimation and reconstruction play pivotal roles in cutting-edge applications such as autonomous vehicles, AR/VR devices, and robotics. For instance, an autonomous vehicle must build a detailed map of its surroundings while accurately localizing itself within that map.
In particular, we want to create dense maps: dense reconstructions significantly enhance interactivity and utility, especially for AR/VR and robotics, as humans and robots increasingly share the same workspace.

However, existing classical dense SLAM methods predominantly rely on modular pipelines, which suffer from minimal information sharing and a lack of utilization of priors. Consequently, these methods are susceptible to environmental changes and require extensive fine-tuning for different settings.
Through this work, we want to design a robust, learning-based method for dense visual SLAM that achieves online, real-time, high-accuracy visual odometry and dense mapping in both urban and in-the-wild settings. For the scope of this project, we focus on the urban SLAM setting, as it is tougher than the indoor setting and is relatively underexplored.
Background
Precise correspondences enable precise reconstruction:
- Given precise correspondences, reconstruction is a geometrically grounded task
- Instead of teaching the network to reason about geometry, we focus on learning precise correspondences, which can then be used for reconstruction
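To illustrate why precise correspondences make reconstruction geometrically grounded: given a matched pixel pair in two calibrated views, the 3D point follows directly from linear (DLT) triangulation, with no learned geometric reasoning involved. Below is a minimal NumPy sketch on a synthetic two-camera setup; the intrinsics, poses, and function names are illustrative, not part of our pipeline:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one correspondence.
    P1, P2: 3x4 projection matrices; x1, x2: pixel coords (u, v)."""
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]              # null-space solution
    return X[:3] / X[3]     # dehomogenize

# Synthetic example: identity camera and a camera translated 1 m along x.
K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.], [0.], [0.]])])

# Project a known 3D point to get an exact correspondence, then recover it.
X_true = np.array([0.3, -0.2, 4.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]

X_est = triangulate(P1, P2, x1, x2)
print(np.allclose(X_est, X_true, atol=1e-6))  # exact matches recover the point
```

The same construction degrades gracefully: noisy correspondences yield a least-squares point, which is why correspondence precision directly bounds reconstruction accuracy.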

Our solution: Match Anything
- One architecture for both dense and sparse matching
- Simple architecture trained with large-scale data
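One way a single architecture can serve both regimes: predict a dense correspondence field once, then read sparse matches off it by sampling at keypoint locations. The sketch below shows only this dense-to-sparse reduction; the array layout and names are assumptions for illustration, not our actual model:

```python
import numpy as np

def dense_to_sparse(flow, keypoints):
    """Sample a dense correspondence field at sparse keypoint locations.
    flow: (H, W, 2) array mapping each pixel (y, x) in image A to its
    (dy, dx) offset in image B; keypoints: (N, 2) integer (y, x) coords."""
    ys, xs = keypoints[:, 0], keypoints[:, 1]
    offsets = flow[ys, xs]          # (N, 2) offsets at the keypoints
    return keypoints + offsets      # matched (y, x) coords in image B

# Toy example: a constant field shifting every pixel by (+2, +3).
flow = np.tile(np.array([2, 3]), (48, 64, 1))
kps = np.array([[10, 20], [30, 40]])
print(dense_to_sparse(flow, kps))   # each keypoint shifted by the field
```

The inverse direction is the usual motivation for training one model: sparse supervision (e.g., SfM tracks) is plentiful at scale, while the dense output is what reconstruction consumes.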
