Abstract
Great progress has been made in StyleGAN-based image editing. To associate with preset attributes, most existing approaches focus on supervised learning for semantically meaningful latent space traversal directions, and each manipulation step is typically determined for an individual attribute. To address this limitation, we propose a Text-guided Unsupervised StyleGAN Latent Transformation (TUSLT) model, which adaptively infers a single transformation step in the latent space of StyleGAN to simultaneously manipulate multiple attributes on a given input image. Specifically, we adopt a two-stage architecture for a latent mapping network to break down the transformation process into two manageable steps. Our network first learns a diverse set of semantic directions tailored to an input image, and later nonlinearly fuses the ones associated with the target attributes to infer a residual vector. The resulting tightly interlinked two-stage architecture delivers the flexibility to handle diverse attribute combinations. By leveraging the cross-modal text-image representation of CLIP, we can perform pseudo annotations based on the semantic similarity between preset attribute text descriptions and training images, and further jointly train an auxiliary attribute classifier with the latent mapping network to provide semantic guidance. We perform extensive experiments to demonstrate that the adopted strategies contribute to the superior performance of TUSLT. © 2023 IEEE.
Original language | English |
---|---|
Title of host publication | Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition |
Publisher | IEEE |
Pages | 19285-19294 |
ISBN (Electronic) | 9798350301298 |
ISBN (Print) | 979-8-3503-0130-4 |
DOIs | |
Publication status | Published - 2023 |
Event | 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023) - Vancouver Convention Center, Vancouver, Canada Duration: 18 Jun 2023 → 22 Jun 2023 https://cvpr2023.thecvf.com/Conferences/2023 https://openaccess.thecvf.com/menu https://ieeexplore.ieee.org/xpl/conhome/1000147/all-proceedings |
Publication series
Name | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
---|---|
ISSN (Print) | 1063-6919 |
ISSN (Electronic) | 2575-7075 |
Conference
Conference | 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023) |
---|---|
Abbreviated title | CVPR2023 |
Country/Territory | Canada |
City | Vancouver |
Period | 18/06/23 → 22/06/23 |
Internet address |
Research Keywords
- Image and video synthesis and generation