Abstract
The accelerated development of 3D acquisition technologies has spurred the emergence of immersive applications, including virtual reality and digital twins. The reliance of these 3D applications on point clouds as fundamental data has catalyzed an exponential surge in data volume. This growth strains existing storage capacity and transmission bandwidth, necessitating efficient compression techniques. However, the inherent characteristics of point clouds, particularly their sparsity, irregularity, and unordered nature, pose substantial challenges for effective data compression. Conventional point cloud compression methods rely on handcrafted coding tools, which reduce redundancy only to a limited extent and yield inferior coding performance. To address these limitations, this thesis focuses on learning-based compression and quality assessment paradigms for 3D point cloud data, enabling adaptive modeling of intricate data characteristics for redundancy elimination and superior coding performance. This thesis is structured around three core contributions: 1) enhancing point cloud geometry compression using compact geometric priors; 2) designing an efficient coding framework for Gaussian point clouds; 3) pioneering visual quality assessment for compressed Gaussian point clouds.
In the first part, this thesis proposes a deep geometry coding framework that leverages effective geometric priors as a reference to facilitate the residual coding of human point clouds. Specifically, these priors are represented by a scaffold based on a parametric human template, which is compactly encoded as side information using only a few bits. Intermediate point clouds are then derived from the scaffold as a spatial reference for compressing the original human point clouds. In this process, features of the intermediate point clouds are warped onto the source point clouds to predict residual features, substantially reducing spatial redundancy. These residuals are then efficiently encoded into bitstreams. Experimental results demonstrate significant geometry compression improvements, validating the efficacy of incorporating geometric prior information.
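As a rough illustration of reference-based residual coding, the minimal sketch below quantizes each source point's offset from its nearest point in a reference cloud and then reconstructs it. It is not the framework proposed in the thesis: it operates on raw coordinates rather than learned features, it uses a random stand-in for the template-derived intermediate point cloud, and every name in it (`encode_residuals`, `decode_residuals`, `step`) is hypothetical.

```python
import numpy as np

def encode_residuals(source, reference, step=0.01):
    """Quantize each source point's offset from its nearest reference point."""
    # Pairwise squared distances, shape (num_source, num_reference).
    d2 = ((source[:, None, :] - reference[None, :, :]) ** 2).sum(axis=-1)
    nearest = d2.argmin(axis=1)                  # index of the closest reference point
    residual = source - reference[nearest]       # small offsets if the reference is good
    symbols = np.round(residual / step).astype(np.int32)
    return nearest, symbols

def decode_residuals(nearest, symbols, reference, step=0.01):
    """Rebuild points from reference positions plus dequantized residuals."""
    return reference[nearest] + symbols * step

rng = np.random.default_rng(0)
# Assumed stand-in for the intermediate (prior-derived) point cloud.
reference = rng.uniform(-1.0, 1.0, size=(200, 3))
# Assumed source cloud: reference points perturbed by small noise.
source = reference[rng.integers(0, 200, size=500)] + rng.normal(0.0, 0.02, size=(500, 3))

nearest, symbols = encode_residuals(source, reference)
reconstructed = decode_residuals(nearest, symbols, reference)
max_error = np.abs(reconstructed - source).max()  # bounded by half the quantization step
```

Because the reference already lies close to the source, the residual symbols span a much smaller range than the quantized raw coordinates would, which is what makes them cheaper for an entropy coder to represent.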
In the second part, this thesis proposes an efficient compression framework for Gaussian point clouds. Unlike traditional point clouds, Gaussian point clouds carry additional volumetric attributes to achieve high-fidelity point-based rendering. Consequently, they have a substantial data volume, which impedes their practical use in real-world applications. Herein, we propose an efficient 3D scene representation, named Compressed Gaussian Splatting (CompGS), which harnesses compact Gaussian point clouds for faithful 3D scene modeling with a remarkably reduced data size. Specifically, to ensure the compactness of Gaussian point clouds, we devise a hybrid primitive structure that captures the predictive relationships among primitives. We then exploit a small set of anchor primitives for prediction, allowing the majority of primitives to be encapsulated in highly compact residual forms. Moreover, we develop a rate-constrained optimization scheme to eliminate redundancies within such hybrid primitives, steering CompGS towards an optimal trade-off between bitrate consumption and representation efficacy. Experimental results show that the proposed CompGS significantly outperforms existing methods, achieving superior compactness in 3D scene representation without compromising model accuracy and rendering quality.
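A rate-constrained optimization of this kind is commonly expressed as a Lagrangian rate-distortion objective. The form below is a generic sketch rather than the exact loss used in the thesis, and all symbols are assumptions introduced here for illustration: $D$ denotes the rendering distortion between the rendered view $\hat{I}$ and the ground-truth view $I$, $R$ denotes the estimated bitrate of the quantized primitive parameters $\hat{\Theta}$, and $\lambda \ge 0$ is the trade-off weight.

$$
\mathcal{L} = D(\hat{I}, I) + \lambda \, R(\hat{\Theta})
$$

Under this formulation, a larger $\lambda$ favors lower bitrate at the cost of rendering quality, while a smaller $\lambda$ favors fidelity.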
In the third part, this thesis pioneers visual quality assessment for compressed Gaussian point cloud data, investigating the impact of compression on user-centered immersive visual experiences. Specifically, a database tailored for Gaussian point clouds is created to investigate the perceptual impact of varying degrees of geometry and appearance distortions. This database comprises participant visual scores collected within an immersive 3D environment, reflecting human perception in 3D immersive applications. Building on this data foundation, a dual-stream objective quality assessment method is developed for Gaussian point clouds. The first stream introduces a cross-domain knowledge distillation module in the 2D observation space, transferring insights from advanced image quality assessment for view-specific evaluations. Concurrently, the second stream directly assesses Gaussian point clouds in 3D modeling space, identifying view-agnostic distortions via local and global aggregation. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods across diverse quality evaluation metrics.
Overall, this thesis advances 3D point cloud compression and quality assessment in three key aspects: 1) Point cloud geometry compression efficacy is improved by leveraging geometric priors. 2) Efficient Gaussian point cloud coding is achieved via intra- and inter-primitive redundancy elimination. 3) Visual quality assessment for compressed Gaussian point clouds is pioneered with a database and a multi-modality metric. Extensive experimental results validate the effectiveness of the proposed methods. This thesis provides promising solutions for the efficient storage, transmission, and processing of point cloud data, enhancing the capabilities and practicality of 3D applications and services.
| Date of Award | 8 Aug 2025 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Shiqi WANG (Supervisor) & Tak Wu Sam KWONG (External Co-Supervisor) |