TY - JOUR
T1 - Gaussian Surfel Splatting for Live Human Performance Capture
AU - Dong, Zheng
AU - Xu, Ke
AU - Gao, Yaoan
AU - Bao, Hujun
AU - Xu, Weiwei
AU - Lau, Rynson W. H.
N1 - Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).
PY - 2024/12
Y1 - 2024/12
N2 - High-quality real-time rendering using user-affordable capture rigs is an essential property of human performance capture systems for real-world applications. However, state-of-the-art performance capture methods may not yield satisfactory rendering results under a very sparse (e.g., four-camera) capture setting. Specifically, neural radiance field (NeRF)-based methods and 3D Gaussian Splatting (3DGS)-based methods tend to produce local geometry errors for unseen performers, while occupancy field (PIFu)-based methods often produce unrealistic rendering results. In this paper, we propose a novel generalizable neural approach to reconstruct and render performers at high quality from very sparse RGBD streams. The core of our method is a novel point-based generalizable human (PGH) representation conditioned on pixel-aligned RGBD features. The PGH representation learns a surface implicit function for the regression of surface points and a Gaussian implicit function for parameterizing the radiance fields of the regressed surface points with 2D Gaussian surfels, and uses surfel splatting for fast rendering. We learn this hybrid human representation via two novel networks. First, we propose a novel point-regressing network (PRNet) with a depth-guided point cloud initialization (DPI) method to regress an accurate surface point cloud based on the denoised depth information. Second, we propose a novel neural blending-based surfel splatting network (SPNet) to render high-quality geometries and appearances in novel views based on the regressed surface points and high-resolution RGBD features of adjacent views. Our method produces free-view human performance videos at 1K resolution and 12 fps on average. Experiments on two benchmarks show that our method outperforms state-of-the-art human performance capture methods. © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.
KW - Gaussian implicit function
KW - human performance capture
KW - point-based generalizable human representation
KW - surface implicit function
UR - http://www.scopus.com/inward/record.url?scp=85210102360&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85210102360&origin=recordpage
U2 - 10.1145/3687993
DO - 10.1145/3687993
M3 - RGC 21 - Publication in refereed journal
SN - 0730-0301
VL - 43
JO - ACM Transactions on Graphics
JF - ACM Transactions on Graphics
IS - 6
M1 - 263
ER -