Geometry-Aware Deep Modeling of Dynamic 3D Point Cloud Sequences
具幾何感知的動態三維點雲序列深度建模
Student thesis: Doctoral Thesis
Detail(s)
Award date: 20 Feb 2024

Link(s)
Permanent link: https://scholars.cityu.edu.hk/en/theses/theses(c50f1007-7f55-4bda-b912-00200a6c29a5).html
Abstract
3D point cloud sequences, pivotal in modern applications such as urban modeling, autonomous driving, and augmented reality, pose numerous computational challenges. Essentially, these sequences are collections of 3D point clouds that capture the evolution of a scene or object over time, much as a video captures 2D frames in succession. They are primarily obtained through advanced sensing technologies, such as LiDAR, which continuously scan and record spatial data from a changing environment. However, crafting precise and efficient representations for 3D point cloud sequences is challenging due to the unstructured nature of each frame and the unknown spatio-temporal relations between frames. Thus, in contrast to static point clouds, 3D point cloud sequences demand methods that extract features jointly across the spatial and temporal domains. This inherent complexity underscores the need for the precise deduction and integration of temporal correspondences in 3D point cloud sequences, a crucial step for progress in dynamic point cloud processing.
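To make the unstructured-frames issue concrete: because the points within each frame carry no canonical ordering, two frames cannot be compared element-wise. A standard order-invariant metric for this situation is the Chamfer distance. The following toy sketch (plain NumPy, not code from the thesis) illustrates it on two frames of a hypothetical sequence:

```python
import numpy as np

def chamfer_distance(P, Q):
    """Symmetric Chamfer distance between two point sets of shape (N, 3) and (M, 3).
    Order-invariant: permuting the rows of P or Q leaves the result unchanged."""
    # Pairwise squared distances, shape (N, M), via broadcasting.
    d = np.sum((P[:, None, :] - Q[None, :, :]) ** 2, axis=-1)
    # Average nearest-neighbour distance in both directions.
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Two frames of a toy "sequence": the second is the first, shifted by 0.1 in x.
frame0 = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
frame1 = frame0 + np.array([0.1, 0., 0.])
print(chamfer_distance(frame0, frame1))  # ≈ 0.02 (0.1² in each direction)
```

Note that the Chamfer distance treats each frame as an unordered set; it says nothing about which point moved where, which is precisely the temporal-correspondence information the thesis sets out to recover.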
3D point cloud sequence processing is an emerging field garnering significant academic attention. Three predominant areas of emphasis within this domain are correspondence, interpolation, and representation. Point cloud correspondence concerns identifying and matching similar structures or features across different point clouds, which is essential for aligning multiple scans of similar objects and ensuring semantic consistency; the technique is integral to applications such as object recognition and reconstruction. Interpolation focuses on enhancing the temporal resolution of a point cloud sequence, playing a pivotal role in areas such as autonomous driving, immersive communication, computer animation, and virtual/augmented reality. Lastly, representation aims at the efficient and accurate storage of sequential point cloud data, which is vital for rapid data retrieval and in-depth analysis.
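As a minimal illustration of the correspondence problem, consider a hypothetical nearest-neighbour baseline (not the learned methods developed in the thesis): given two noisy, permuted copies of the same point set, match each point of one frame to its closest point in the other, producing an index map and a binary correspondence matrix:

```python
import numpy as np

def nn_correspondence(P, Q):
    """Hypothetical baseline: match each point of P (N, 3) to its nearest
    neighbour in Q (M, 3). Returns the index map and a 0/1 correspondence
    matrix with exactly one 1 per row (row-stochastic)."""
    d = np.sum((P[:, None, :] - Q[None, :, :]) ** 2, axis=-1)  # (N, M)
    idx = d.argmin(axis=1)
    C = np.zeros_like(d)
    C[np.arange(len(P)), idx] = 1.0
    return idx, C

P = np.array([[0., 0., 0.], [1., 0., 0.]])
Q = np.array([[1.05, 0., 0.], [-0.02, 0., 0.]])  # same points, permuted + noise
idx, C = nn_correspondence(P, Q)
print(idx)  # → [1 0]: the matching recovers the permutation
```

This toy baseline breaks down under large non-rigid deformations or differing sampling densities, which is why dense correspondence is treated as a learning problem in the thesis.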
In this thesis, we delve into deep learning frameworks designed for 3D point cloud sequences, connecting three central research areas. The first, CorrNet3D, is an unsupervised method for dense 3D shape correspondence via deformation-like reconstruction. This method forms the foundation for the subsequent work, demonstrating its adaptability and superior performance over existing models. Building on this, IDEA-Net addresses dynamic 3D point cloud interpolation under extensive non-rigid deformations: it does not merely interpolate, but also maintains correspondence across the resulting sequences. Drawing on insights from CorrNet3D, IDEA-Net effectively improves 3D motion data acquisition, emphasizing the importance of temporal consistency. Lastly, the thesis presents Structured Point Cloud Video (SPCV), a novel representation that structuralizes 3D point cloud sequences by encoding them into 2D color videos, harnessing the power of established 2D data-processing techniques and marking a significant stride in 3D sequence processing. In essence, this thesis presents approaches encompassing foundational correspondence, interpolation, and structuralized representation of 3D point cloud sequences, offering a comprehensive exploration of their processing.
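To give intuition for the encoding idea behind a point-cloud-to-video representation, the sketch below packs one frame's xyz coordinates into an RGB image by per-axis min-max normalisation, and inverts the packing. This is only a toy illustration of the general concept: the actual SPCV encoding in the thesis is learned and designed for spatial and temporal coherence, not a fixed normalisation.

```python
import numpy as np

def frame_to_image(points, H, W):
    """Toy packing (not the thesis's SPCV pipeline): map one 3D frame's xyz
    coordinates into an H x W RGB image via per-axis min-max normalisation."""
    assert len(points) == H * W
    lo, hi = points.min(axis=0), points.max(axis=0)
    rgb = (points - lo) / np.where(hi > lo, hi - lo, 1.0)  # each axis -> [0, 1]
    img = np.round(rgb * 255).astype(np.uint8).reshape(H, W, 3)
    return img, (lo, hi)

def image_to_frame(img, lo, hi):
    """Invert the packing: recover xyz up to 8-bit quantisation error."""
    rgb = img.reshape(-1, 3).astype(np.float64) / 255.0
    return rgb * (hi - lo) + lo

rng = np.random.default_rng(0)
frame = rng.uniform(-1, 1, size=(64, 3))          # 64 points -> an 8x8 image
img, (lo, hi) = frame_to_image(frame, 8, 8)
recon = image_to_frame(img, lo, hi)
print(np.abs(recon - frame).max())  # bounded by half the 8-bit quantisation step
```

Once frames are images, a sequence becomes an ordinary video, so mature 2D tools (video codecs, image networks) can be applied; the hard part, which the thesis addresses, is choosing the point-to-pixel layout so that consecutive images are smooth and temporally consistent.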