
TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text Fields

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

Generative models have shown remarkable progress in the 3D domain. Recent works learn 3D representations explicitly under text-3D guidance. However, limited text-3D data restricts the vocabulary scale and text control of generations. Generators may easily fall into a stereotyped concept for certain text prompts, thus losing open-vocabulary generation ability. To tackle this issue, we introduce a conditional 3D generative model, namely TextField3D. Specifically, rather than using the text prompts as input directly, we propose injecting dynamic noise into the latent space of given text prompts, i.e., Noisy Text Fields (NTFs). In this way, limited 3D data can be mapped to the appropriate range of textual latent space that is expanded by NTFs. To this end, an NTFGen module is proposed to model general text latent code in noisy fields. Meanwhile, an NTFBind module is proposed to align view-invariant image latent code to noisy fields, further supporting image-conditional 3D generation. To guide the conditional generation in both geometry and texture, multi-modal discrimination is constructed with a text-3D discriminator and a text-2.5D discriminator. Compared to previous methods, TextField3D offers three merits: 1) large vocabulary, 2) text consistency, and 3) low latency. Extensive experiments demonstrate that our method achieves a potential open-vocabulary 3D generation capability.
© 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved.
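The core idea in the abstract, perturbing a text prompt's latent code with dynamic noise so that one prompt covers a region of latent space rather than a single point, can be illustrated with a minimal sketch. This is not the paper's NTFGen module: the function name `noisy_text_field`, the Gaussian noise, the uniformly sampled noise scale, the `max_sigma` parameter, and the unit-length re-normalization are all assumptions chosen for illustration.

```python
import numpy as np

def noisy_text_field(text_latent, max_sigma=0.1, rng=None):
    """Perturb a text latent code with Gaussian noise of a randomly drawn scale.

    Hypothetical illustration only: the noise distribution, `max_sigma`, and
    the re-normalization step are assumptions, not the paper's NTFGen module.
    """
    rng = np.random.default_rng() if rng is None else rng
    text_latent = np.asarray(text_latent, dtype=np.float64)
    # "Dynamic" here means the noise scale is re-sampled on every call.
    sigma = rng.uniform(0.0, max_sigma)
    noise = rng.normal(0.0, 1.0, size=text_latent.shape)
    noisy = text_latent + sigma * noise
    # Assumption: latents live on the unit sphere, so project back onto it.
    return noisy / np.linalg.norm(noisy)
```

Calling the function repeatedly on the same latent code yields nearby but distinct codes, which is the sense in which a small set of captions can be spread over a wider region of the text latent space.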
Original language: English
Title of host publication: The Twelfth International Conference on Learning Representations, ICLR 2024
Publisher: International Conference on Learning Representations, ICLR
Number of pages: 23
Publication status: Published - 2024
Event: 12th International Conference on Learning Representations (ICLR 2024) - Messe Wien Exhibition and Congress Center, Vienna, Austria
Duration: 7 May 2024 to 11 May 2024
https://iclr.cc/Conferences/2024
https://openreview.net/group?id=ICLR.cc/2024/Conference

Publication series

Name: International Conference on Learning Representations, ICLR

Conference

Conference: 12th International Conference on Learning Representations (ICLR 2024)
Place: Austria
City: Vienna
Period: 7/05/24 to 11/05/24

Funding

This work was supported by the National Key R&D Program of China under Grant No. 2021ZD0112100 and the National Natural Science Foundation of China (NSFC) under Grant No. U19A2073.
