Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Research output: Conference Papers › Poster › peer-review
Detail(s)
Original language | English |
---|---|
Publication status | Published - 24 Apr 2025 |
Conference
Title | The Thirteenth International Conference on Learning Representations |
---|---|
Location | Singapore EXPO |
Place | Singapore |
Period | 24 - 28 April 2025 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(c5ed9675-5f73-4282-8eea-605870384925).html |
---|
Abstract
Generative 3D modeling has made significant advances recently, but it remains constrained by its inherently ill-posed nature, leading to challenges in quality and controllability. Inspired by the real-world workflow that designers typically refer to existing 3D models when creating new ones, we propose Phidias, a novel generative model that uses diffusion for reference-augmented 3D generation. Given an image, our method leverages a retrieved or user-provided 3D reference model to guide the generation process, thereby enhancing the generation quality, generalization ability, and controllability. Phidias integrates three key components: 1) meta-ControlNet to dynamically modulate the conditioning strength, 2) dynamic reference routing to mitigate misalignment between the input image and 3D reference, and 3) self-reference augmentations to enable self-supervised training with a progressive curriculum. Collectively, these designs result in significant generative improvements over existing methods. Phidias forms a unified framework for 3D generation using text, image, and 3D conditions, offering versatile applications.
Bibliographic Note
Since this conference is yet to commence, the information for this record is subject to revision.
Citation Format(s)
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion. / Wang, Zhenwei; Wang, Tengfei; He, Zexin et al.
2025. Poster session presented at The Thirteenth International Conference on Learning Representations, Singapore.