Abstract
Recently, text-to-image denoising diffusion probabilistic models (DDPMs) have demonstrated impressive image generation capabilities and have also been successfully applied to image inpainting. However, in practice, users often require more control over the inpainting process beyond textual guidance, especially when they want to composite objects with customized appearance, color, shape, and layout. Unfortunately, existing diffusion-based inpainting methods are limited to single-modal guidance and require task-specific training, hindering their cross-modal scalability. To address these limitations, we propose Uni-paint, a unified framework for multimodal inpainting that offers various modes of guidance, including unconditional, text-driven, stroke-driven, exemplar-driven inpainting, as well as a combination of these modes. Furthermore, our Uni-paint is based on pretrained Stable Diffusion and does not require task-specific training on specific datasets, enabling few-shot generalizability to customized images. We have conducted extensive qualitative and quantitative evaluations that show our approach achieves comparable results to existing single-modal methods while offering multimodal inpainting capabilities not available in other methods. Code is available at https://github.com/ysy31415/unipaint.
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Original language | English |
---|---|
Title of host publication | MM '23: Proceedings of the 31st ACM International Conference on Multimedia |
Publisher | Association for Computing Machinery |
Pages | 3190-3199 |
ISBN (Print) | 9798400701085 |
DOIs | |
Publication status | Published - Oct 2023 |
Event | 31st ACM International Conference on Multimedia (MM 2023) - Westin Ottawa, Ottawa, Canada; Duration: 29 Oct 2023 → 3 Nov 2023; https://www.acmmm2023.org/accommodation/ |
Publication series
Name | MM - Proceedings of the ACM International Conference on Multimedia |
---|---|
Conference
Conference | 31st ACM International Conference on Multimedia (MM 2023) |
---|---|
Abbreviated title | MM '23 |
Country/Territory | Canada |
City | Ottawa |
Period | 29/10/23 → 3/11/23 |
Internet address |
Bibliographical note
Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).
Funding
This work is supported by a GRF grant (Project No. CityU 11208123) from the Research Grants Council (RGC) of Hong Kong. We also thank Unsplash and the photographers for generously sharing their high-quality, free-to-use images used in this research.
Research Keywords
- diffusion model
- image inpainting
- multimodal
Fingerprint
Dive into the research topics of 'Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model'. Together they form a unique fingerprint.
Projects
- 1 Active
- GRF: Text-to-3D Generation and Manipulation with Neural Radiance Field Representation
  LIAO, J. (Principal Investigator / Project Coordinator)
  1/01/24 → …
  Project: Research