Cross-modal Global Localization with a LiDAR and Geo-referenced Aerial Images for Autonomous Vehicles in GNSS-degraded Environments

Project: Research



In recent years, autonomous driving has demonstrated its feasibility in a number of applications, but many open issues must still be addressed to enable large-scale adoption of autonomous vehicles. One such issue is global localization in Global Navigation Satellite System (GNSS)-degraded environments. Global localization is a fundamental capability for autonomous vehicles, and most existing solutions rely on GNSS. However, GNSS positioning is often degraded by signal reflections and multipath effects, especially in urban environments such as urban canyons. Visual- or LiDAR-based place recognition and re-localization methods can reduce the reliance on GNSS; however, they require pre-building a geo-referenced visual or point-cloud database or map with a data-collection vehicle, which is tedious, time-consuming and expensive. Moreover, vision-based methods are sensitive to changes in season, weather, illumination and viewpoint. To address these challenges, we propose a novel cross-modal solution that matches on-vehicle LiDAR point clouds against off-the-shelf geo-referenced aerial images (e.g., Google satellite images) for global localization. The aerial images provide global location information (i.e., longitude and latitude) as an alternative to GNSS. Since the aerial images are already available, the tedious work of driving a data-collection vehicle to pre-build a geo-referenced database can be avoided. Because of the large modality gap between on-vehicle LiDAR point clouds and aerial images, ground-to-aerial cross-modal matching remains challenging and largely underexplored. We propose to solve this problem by generating dense semantic bird's-eye-view (BEV) maps from the two heterogeneous data sources and employing the BEV maps as an intermediate modality for data matching to achieve place recognition.
With the place recognition results, we propose to register the two semantic BEV maps to produce global location constraints, and to fuse LiDAR odometry with these constraints to achieve global metric localization. In summary, the proposed solution rests on innovations in four components: (1) dense semantic BEV map generation from sparse LiDAR point clouds; (2) semantic graph-based descriptor extraction and graph neural network (GNN)-based location retrieval; (3) semantic BEV map registration to produce global location constraints; (4) metric localization by fusing state-of-the-art LiDAR odometry with the global constraints. The project is expected to advance knowledge through high-impact publications and to release open-source implementations that can be used in practice to increase the robustness and accuracy of global localization in GNSS-degraded environments for autonomous vehicles.
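
To make the idea of a semantic BEV map concrete, the following is a minimal sketch of the simplest possible starting point: projecting semantically labelled LiDAR points onto a top-down grid. It is not the project's method. The project targets dense BEV generation from sparse point clouds, whereas this baseline leaves empty every cell that receives no LiDAR return. The function name `rasterize_bev`, the `UNKNOWN` label id, the grid extents, the cell size and the "highest point wins" rule are all illustrative assumptions, and the per-point labels are assumed to come from some upstream segmentation step.

```python
from typing import List, Sequence, Tuple

UNKNOWN = 0  # hypothetical label id for cells with no LiDAR return


def rasterize_bev(
    points: Sequence[Tuple[float, float, float]],
    labels: Sequence[int],
    x_range: Tuple[float, float] = (-50.0, 50.0),
    y_range: Tuple[float, float] = (-50.0, 50.0),
    resolution: float = 0.5,
) -> List[List[int]]:
    """Project labelled 3D points (x, y, z in metres) onto a top-down grid.

    Each cell keeps the label of its highest point (largest z).
    Rows index y, columns index x; cells without points stay UNKNOWN.
    """
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    n_cols = int((x_range[1] - x_range[0]) / resolution)
    n_rows = int((y_range[1] - y_range[0]) / resolution)
    grid = [[UNKNOWN] * n_cols for _ in range(n_rows)]
    top_z = [[float("-inf")] * n_cols for _ in range(n_rows)]
    for (x, y, z), label in zip(points, labels):
        # Floor division maps a metric coordinate to its cell index.
        col = int((x - x_range[0]) // resolution)
        row = int((y - y_range[0]) // resolution)
        # Skip points outside the grid; otherwise the highest point wins.
        if 0 <= row < n_rows and 0 <= col < n_cols and z > top_z[row][col]:
            top_z[row][col] = z
            grid[row][col] = label
    return grid
```

A grid like this, built from the vehicle's LiDAR, could in principle be compared with a grid of the same layout derived from a geo-referenced aerial image, which is the role the BEV map plays as an intermediate modality in the description above.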


Project number: 9043642
Grant type: GRF
Effective start/end date: 1/09/23 → …