Abstract
The domain of 3D virtual modeling of real-world objects has been a central research area in computer vision and graphics. Traditional manual modeling typically relies on mesh representations, constructing 3D models bottom-up from reflectance attributes such as materials. While this offers flexibility for modification, it often demands substantial manual effort, making it difficult to acquire highly detailed 3D models efficiently. In recent years, the rapid development of 3D reconstruction techniques has significantly lowered the barrier to acquiring 3D assets. Among these methods, Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3D GS) have emerged as two of the most prominent technologies. Their success stems from the use of radiance fields for 3D representation, which offer robust modeling capabilities and enable precise reconstruction of object details. However, these representations simplify the modeling process by focusing on geometry and appearance rather than differentiating reflectance attributes, which hinders precise 3D content editing involving object attributes. In this thesis, we propose two methods to address this editing inefficiency in radiance fields, focusing on attribute editing of existing objects and object editing in reconstructed scenes.

For attribute editing of existing objects, the lack of decomposed reflectance attributes hampers precise editing of object material and illumination. To support reflectance decomposition and subsequent editing, recent advanced methods incorporate inverse rendering into the modeling process and employ a variant of the radiance field, termed the neural reflectance field, to model the decomposed factors of objects and thereby facilitate further editing. However, these methods often rely on continuous representations, neglecting the discrete nature of real-world materials, which can result in noisy material decomposition and complex editing procedures. To address this limitation, we introduce VQ-NeRF, the first neural reflectance field enabling discrete material decomposition and editing in 3D scenes. By incorporating Vector Quantization (VQ) into reflectance decomposition, continuous materials are discretized, which reduces noise in the predicted materials and simplifies the selection of specific materials for editing.
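To illustrate the discretization step, below is a minimal PyTorch-style sketch of vector quantization applied to per-point material features, in the spirit of VQ-VAE codebook lookup. The class name, codebook size, and feature dimension are illustrative assumptions, not the thesis implementation.

```python
import torch
import torch.nn as nn

class MaterialVQ(nn.Module):
    """Hypothetical VQ layer: snaps continuous material features to a
    small learnable codebook of discrete material embeddings."""

    def __init__(self, num_materials: int = 8, dim: int = 16):
        super().__init__()
        # Codebook: one embedding per discrete material (size assumed).
        self.codebook = nn.Embedding(num_materials, dim)

    def forward(self, z: torch.Tensor):
        # z: (N, dim) continuous material features from the reflectance field.
        d = torch.cdist(z, self.codebook.weight)   # (N, K) distances to codes
        idx = d.argmin(dim=1)                      # discrete material id per point
        z_q = self.codebook(idx)                   # quantized material feature
        # Straight-through estimator so gradients still reach the encoder.
        z_q = z + (z_q - z).detach()
        return z_q, idx
```

Under this kind of scheme, editing a material reduces to selecting a codebook index and modifying its single embedding, rather than editing a continuous field point by point.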
For object editing within 3D scenes, such as generating and inserting new objects into reconstructed scenes, the lack of reflectance attributes in radiance fields prevents the use of scene illumination to render the generated 3D models, disrupting the harmony between generated objects and the scene background. To address this challenge, we advocate exploiting the prior of the Stable Diffusion Inpainting model to craft newly generated objects in accordance with the surrounding scene context. Our method, MVInpainter, integrates a multi-view diffusion inpainting model derived from a pre-trained Stable Video Diffusion model to ensure consistent object inpainting across multiple viewpoints. By incorporating this multi-view inpainting prior, our method achieves view-consistent and harmonious generative object insertions within 3D Gaussian Splatting scenes.
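As a rough illustration of the joint multi-view denoising idea, the following sketch assumes a hypothetical video-diffusion inpainting backbone; `mv_unet`, `encode`, `decode`, and `scheduler` are placeholder names, not the MVInpainter API. It shows how all views can be denoised together so that cross-frame attention, inherited from the video model, keeps the inpainted object consistent across viewpoints.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def inpaint_views(views, masks, encode, decode, mv_unet, scheduler):
    """views: (V, 3, H, W) scene renders; masks: (V, 1, H, W), 1 = region to fill.
    All callables are hypothetical stand-ins for a fine-tuned video-diffusion model."""
    masked_lat = encode(views * (1.0 - masks))             # latents of known content
    m = F.interpolate(masks, size=masked_lat.shape[-2:])   # mask at latent resolution
    lat = torch.randn_like(masked_lat)                     # start from pure noise
    for t in scheduler.timesteps:
        # All V views pass through the backbone together; its cross-frame
        # attention ties them to a single, view-consistent object.
        eps = mv_unet(torch.cat([lat, masked_lat, m], dim=1), t)
        lat = scheduler.step(eps, t, lat).prev_sample
    return decode(lat)                                     # (V, 3, H, W) inpainted views
```

The key design point this sketch captures is that the views are stacked and denoised jointly, rather than inpainted one by one, which is what allows the generated object to remain coherent when rendered back into the 3D GS scene.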
We conduct experimental evaluations of these methods on various datasets, comparing them with state-of-the-art techniques. Through quantitative and qualitative analyses, we validate that our proposed approaches achieve accurate, controllable, and high-quality 3D content editing at both the attribute and object levels within radiance fields.
| Date of Award | 21 Mar 2025 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Jing LIAO (Supervisor) |