Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Research output: Conference Papers, Poster, peer-review

Detail(s)

Original language: English
Publication status: Published - 24 Apr 2025

Conference

Title: The Thirteenth International Conference on Learning Representations
Location: Singapore EXPO
Place: Singapore
Period: 24 - 28 April 2025

Abstract

Generative 3D modeling has made significant advances recently, but it remains constrained by its inherently ill-posed nature, leading to challenges in quality and controllability. Inspired by the real-world workflow that designers typically refer to existing 3D models when creating new ones, we propose Phidias, a novel generative model that uses diffusion for reference-augmented 3D generation. Given an image, our method leverages a retrieved or user-provided 3D reference model to guide the generation process, thereby enhancing the generation quality, generalization ability, and controllability. Phidias integrates three key components: 1) meta-ControlNet to dynamically modulate the conditioning strength, 2) dynamic reference routing to mitigate misalignment between the input image and 3D reference, and 3) self-reference augmentations to enable self-supervised training with a progressive curriculum. Collectively, these designs result in significant generative improvements over existing methods. Phidias forms a unified framework for 3D generation using text, image, and 3D conditions, offering versatile applications.

Bibliographic Note

Since this conference is yet to commence, the information for this record is subject to revision.

Citation Format(s)

Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion. / Wang, Zhenwei; Wang, Tengfei; He, Zexin et al.
2025. Poster session presented at The Thirteenth International Conference on Learning Representations, Singapore.
