Diverse Semantic Image Synthesis with various conditioning modalities
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Detail(s)
Field | Value |
---|---|
Original language | English |
Article number | 112727 |
Journal / Publication | Knowledge-Based Systems |
Volume | 309 |
Online published | 19 Nov 2024 |
Publication status | Online published - 19 Nov 2024 |
Abstract
Semantic image synthesis aims to generate high-fidelity images from a segmentation mask, and previous methods typically train a generator to associate a global random map with the conditioning mask. However, the lack of independent control of regional content impedes their application. To address this issue, we propose an effective approach for Multi-modal conditioning-based Diverse Semantic Image Synthesis, which is referred to as McDSIS. In this model, there are a number of constituent generators incorporated to synthesize the content in semantic regions from independent random maps. The regional content can be determined by the style code associated with a random map, extracted from a reference image, or by embedding a textual description via our proposed conditioning mechanisms. As a result, the generation process is spatially disentangled, which facilitates independent synthesis of diverse content in a semantic region, while at the same time preserving other content. Due to this flexible architecture, in addition to achieving superior performance over state-of-the-art semantic image generation models, McDSIS is capable of performing various visual tasks, such as face inpainting, swapping, local editing, etc. © 2024 Elsevier B.V.
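The spatially disentangled generation described in the abstract can be illustrated with a toy sketch. This is not the McDSIS implementation: `constituent_generator` and `synthesize` are hypothetical names, the "generators" are seeded random-number stand-ins rather than trained networks, and integer seeds play the role of the independent random maps. It only shows the compositing idea, where each semantic region takes its content from its own generator, so resampling one region leaves the others untouched.

```python
import random


def constituent_generator(class_id, seed, height, width):
    """Hypothetical stand-in for one per-region generator.

    Maps an independent random seed to a full-size content map.
    A real model would use a trained network here.
    """
    rng = random.Random(seed * 1000 + class_id)
    return [[rng.random() for _ in range(width)] for _ in range(height)]


def synthesize(mask, seeds):
    """Composite per-class outputs according to the segmentation mask.

    Pixel (y, x) is taken from the generator assigned to class mask[y][x].
    """
    height, width = len(mask), len(mask[0])
    outputs = {
        class_id: constituent_generator(class_id, seed, height, width)
        for class_id, seed in seeds.items()
    }
    return [
        [outputs[mask[y][x]][y][x] for x in range(width)]
        for y in range(height)
    ]


# Toy segmentation mask with three semantic classes (0, 1, 2).
mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 1, 1],
]

image_a = synthesize(mask, {0: 11, 1: 22, 2: 33})
# Resample only region 1; regions 0 and 2 keep their seeds.
image_b = synthesize(mask, {0: 11, 1: 99, 2: 33})

# Collect the pixel positions whose content differs between the two images.
changed = [
    (y, x)
    for y in range(len(mask))
    for x in range(len(mask[0]))
    if image_a[y][x] != image_b[y][x]
]
```

In this sketch every position in `changed` lies inside region 1, which mirrors the independent regional control the abstract describes. Conditioning on a reference image or a text description would replace the seed with a style code, which is beyond what this toy example models.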
Research Area(s)
- Constituent generators, Multi-modal conditioning-based editing, Semantic image synthesis, Spatially disentangled synthesis
Citation Format(s)
Diverse Semantic Image Synthesis with various conditioning modalities. / Wu, Chaoyue; Li, Rui; Liu, Cheng et al.
In: Knowledge-Based Systems, Vol. 309, 112727, 30.01.2025.