Interpretable Deep Learning Frameworks for 3D Point Cloud Sampling
基於深度學習的三維點雲採樣可解釋性網絡
Student thesis: Doctoral Thesis
Detail(s)
Award date | 23 Nov 2021 |
---|---|
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(b2129c7a-b698-4a0d-bad9-df246c0d5a70).html |
---|---|
Abstract
In recent years, three-dimensional (3D) data have been widely used in applications such as 3D city reconstruction, autonomous driving, and virtual/augmented reality. A 3D point cloud is a set of discrete points located on a scanned surface. Unlike other 3D representations, a point cloud is a raw representation that can be obtained directly from scanning devices such as LiDAR. Although 3D sensing technology has improved greatly in recent years, acquiring point clouds dense enough to represent shapes with rich geometric detail remains costly and time-consuming. Meanwhile, processing dense 3D point clouds remains challenging because of the high cost of computation, storage, and communication. Practical applications therefore demand point clouds at various resolutions.
Point cloud sampling, which includes downsampling and upsampling, is a popular technique for adjusting point cloud resolution without modifying the hardware. Point cloud downsampling selects a subset of the original dense point cloud to reduce redundancy, thereby improving the runtime performance of downstream applications and saving storage space and transmission bandwidth. By contrast, point cloud upsampling generates, from a given sparse point cloud, a dense point cloud that faithfully represents the underlying surface. Sampling techniques designed specifically for point cloud data can be categorized into traditional methods and deep learning-based methods. Traditional methods sample point clouds heuristically or solve optimization problems formulated under prior assumptions about the data. Because they cannot fully exploit the properties of the data, their performance is limited. Deep learning-based methods can effectively learn structure from data and therefore outperform traditional methods. However, existing deep learning-based methods are heavily inspired by techniques from 2D vision tasks and give little consideration to local neighborhood information and the geometric properties of the input shape.
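To make the notion of heuristic downsampling concrete, the following is a minimal sketch of greedy farthest point sampling, a widely used traditional heuristic. The abstract does not name this particular method, so its choice here is an assumption made for illustration; the function name `farthest_point_sampling`, its parameters, and the random test data are likewise hypothetical.

```python
import numpy as np

def farthest_point_sampling(points, m, start=0):
    """Greedily pick m of the n input points (illustrative heuristic only).

    points: (n, 3) array of coordinates; returns an array of m row indices.
    """
    n = points.shape[0]
    if not 0 < m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    selected = np.empty(m, dtype=np.int64)
    selected[0] = start
    # Distance from every point to its nearest already-selected point.
    min_dist = np.linalg.norm(points - points[start], axis=1)
    for i in range(1, m):
        # Take the point that is farthest from the current selection.
        nxt = int(np.argmax(min_dist))
        selected[i] = nxt
        dist = np.linalg.norm(points - points[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return selected

# Hypothetical data: a random "dense" cloud reduced from 1000 to 64 points.
rng = np.random.default_rng(0)
dense = rng.random((1000, 3))
idx = farthest_point_sampling(dense, 64)
sparse = dense[idx]  # the downsampled cloud is a subset of the input
```

The output is a strict subset of the input, which matches the definition of downsampling given above. Because the selection rule is fixed in advance and uses only pairwise distances, it cannot adapt to a downstream task, which is the kind of limitation the abstract attributes to traditional methods.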
In this thesis, we discuss three novel deep learning-based point cloud sampling frameworks: MOPS-Net for point cloud downsampling, and PUGeo-Net and MAFU-Net for point cloud upsampling. MOPS-Net is designed from the perspective of matrix optimization: we relax the binary restriction on the variables and formulate a constrained, differentiable matrix optimization problem. MOPS-Net is a deep learning framework that mimics this matrix optimization by exploring both the local and global structures of the input data. The upsampling network PUGeo-Net incorporates discrete differential geometry into deep learning by learning the first and second fundamental forms, which fully represent the local geometry up to rigid motions. As a by-product, PUGeo-Net can compute normals for both the original and the generated points, which is highly desirable for surface reconstruction algorithms. The other upsampling framework, MAFU-Net, uses the linear approximation theorem to formulate the upsampling problem explicitly. Its two primary components, adaptive interpolation and high-order refinement, can be optimized end-to-end. Moreover, thanks to a simple yet effective training strategy, MAFU-Net requires only a single neural network trained once to handle various upsampling factors. All three are interpretable, data-driven methods that formulate the sampling problem explicitly by considering point cloud properties. They achieve state-of-the-art performance with interpretable and compact deep learning networks.
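The idea of relaxing a binary selection variable can be illustrated with a small NumPy sketch. This is not MOPS-Net and not the thesis's formulation, which the abstract does not spell out. It assumes, for illustration only, that downsampling is written as Q = SᵀP with a selection matrix S whose columns each pick one input point, and that the relaxation replaces the 0/1 entries with non-negative weights summing to one per column. The names `hard_selection_matrix`, `soft_selection_matrix`, and `temperature` are hypothetical.

```python
import numpy as np

def hard_selection_matrix(n, indices):
    """Binary n x m matrix: column j has a single 1 at row indices[j]."""
    S = np.zeros((n, len(indices)))
    S[indices, np.arange(len(indices))] = 1.0
    return S

def soft_selection_matrix(scores, temperature=1.0):
    """Relaxed n x m matrix: column-wise softmax of real-valued scores.

    Entries lie in [0, 1] and each column sums to 1, so the matrix is a
    smooth function of the scores (assumed relaxation, for illustration).
    """
    z = scores / temperature
    z = z - z.max(axis=0, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
P = rng.random((8, 3))                    # hypothetical input cloud, n = 8

# Binary case: S^T P reproduces the chosen rows of P exactly.
S_hard = hard_selection_matrix(8, [1, 4, 6])
Q_hard = S_hard.T @ P

# Relaxed case: each output point is a convex combination of input points.
scores = rng.normal(size=(8, 3))          # stand-in for learned scores
S_soft = soft_selection_matrix(scores, temperature=0.5)
Q_soft = S_soft.T @ P
```

In the binary case the output is a true subset of the input; in the relaxed case each output point lies in the convex hull of the input, and the mapping from scores to output is differentiable, which is what makes gradient-based training possible. Lowering `temperature` pushes each column of the relaxed matrix closer to a one-hot vector.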