Deep Regular Geometry Representations for 3D Point Cloud Processing
Description
Three-dimensional point clouds (3D-PCs) have been adopted in a wide range of applications, such as immersive telepresence, virtual/augmented reality, geographic mapping, and autonomous driving. Motivated by the success of deep learning techniques for 2D images and videos, deep learning for 3D-PCs is attracting considerable attention in both academia and industry. Unfortunately, unlike 2D images and videos, which are defined on a regular Euclidean grid, 3D-PCs are irregular, unordered sets of spatial points whose underlying structure lies in a non-Euclidean space. This difference poses great challenges in transferring mature learning techniques developed for regular 2D domains to irregular 3D geometric domains. Despite recent advances in highly specialized processing pipelines designed to overcome this irregularity and lack of structure, their performance and computational efficiency remain unsatisfactory.

To address the challenges posed by irregularity, this project aims to investigate a novel regular 2D geometry representation for irregular 3D-PCs. Conceptually, we propose to convert a 3D-PC with arbitrary geometry and topology into a three-channel 2D lattice structure, called a 2D point image (2D-PI), in which point coordinates are stored in grid pixels while neighborhood consistency is maintained. Specifically, we will investigate an unsupervised learning framework to produce the 2D-PI representation. Owing to this regular structure, numerous 2D image/video techniques, such as convolutional neural networks and mature image/video codecs, can be readily applied to 2D-PIs for low-level geometry modeling/processing and high-level 3D shape analysis/understanding. Moreover, building on these structural advantages, we will explore learning operators tailored to 2D-PIs to boost downstream processing.
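As a rough illustration of the data layout only, the sketch below arranges a point cloud into a height × width grid whose pixels each hold one (x, y, z) coordinate triple. The function name `naive_point_image` and the sort-based ordering are hypothetical stand-ins chosen for brevity. They are not the unsupervised learning framework proposed in this project, and a simple sort gives only a crude approximation of neighborhood consistency.

```python
import random


def naive_point_image(points, height, width):
    """Arrange height*width 3D points into a height x width grid of (x, y, z) pixels.

    Hypothetical stand-in for a learned mapping: points are sorted by x and cut
    into rows, then each row is sorted by y, so that adjacent pixels tend to
    hold spatially nearby points.
    """
    if len(points) != height * width:
        raise ValueError("expected exactly height * width points")
    # Sort by x, so that each row covers one slab of the x range.
    by_x = sorted(points, key=lambda p: p[0])
    image = []
    for r in range(height):
        row = by_x[r * width:(r + 1) * width]
        # Order the points inside the slab by y.
        image.append(sorted(row, key=lambda p: p[1]))
    return image


# Toy input: 256 random points in the unit cube (illustrative data only).
random.seed(0)
cloud = [(random.random(), random.random(), random.random()) for _ in range(16 * 16)]

image = naive_point_image(cloud, 16, 16)

# Flattening the grid recovers the original point set, in a different order.
recovered = [p for row in image for p in row]
```

Because the result is a regular grid with three values per pixel, it can be passed to any routine that expects an image-like array. The point set itself is unchanged, since the grid only fixes an ordering.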
Since the structural regularity of 2D-PIs naturally avoids cumbersome data structurization, we will investigate efficient hierarchical learning frameworks to extend the 2D-PI representation to large-scale 3D-PCs, which is urgently needed in real-world applications but has been overlooked by most previous studies because of computational limitations. Finally, by additionally exploiting cross-frame spatio-temporal consistency, we will extend the 2D-PI representation to dynamic 3D-PC sequences, generating 2D point videos that allow well-developed video processing techniques to be applied to dynamic 3D-PC processing.

Building on our solid technical background and promising preliminary results, we envision that our investigations will provide practical solutions for efficient geometry representation of 3D-PCs, bring novel insights to geometry processing, and open new paradigms for deep-learning-based 3D-PC processing to advance this emerging field. We believe that the scientific findings of this project will motivate subsequent academic exploration of 3D-PC processing and of general downstream applications of geometric signal modeling.
Effective start/end date: 1/01/23 → …