Self-Calibration Flow Guided Denoising Diffusion Model for Human Pose Transfer
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Pages (from-to) | 7896-7911 |
Number of pages | 16 |
Journal / Publication | IEEE Transactions on Circuits and Systems for Video Technology |
Volume | 34 |
Issue number | 9 |
Online published | 28 Mar 2024 |
Publication status | Published - Sept 2024 |
Abstract
The human pose transfer task aims to generate synthetic person images that preserve the style of reference images while accurately aligning them with the desired target pose. However, existing methods based on generative adversarial networks (GANs) struggle to produce realistic details and often suffer from spatial misalignment. Methods relying on denoising diffusion models, on the other hand, require a large number of model parameters, resulting in slower convergence. To address these challenges, we propose a self-calibration flow-guided module (SCFM) to establish precise spatial correspondence between reference images and target poses. This module helps the denoising diffusion model predict the noise at each denoising step more effectively. Additionally, we introduce a multi-scale feature fusing module (MSFF) that enhances the denoising U-Net architecture through a cross-attention mechanism, achieving better performance with a reduced parameter count. Our proposed model outperforms state-of-the-art methods on the DeepFashion and Market-1501 datasets in both quantitative and qualitative evaluations of the synthesized images. Our code is publicly available at https://github.com/zylwithxy/SCFM-guided-DDPM. © 2024 IEEE.
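The idea of using a flow field to put reference content into spatial correspondence with a target pose can be illustrated with a generic backward-warping sketch. This is not the authors' SCFM, whose details are not given in this record; it only shows the standard operation of sampling a reference feature map at flow-displaced positions with bilinear interpolation. The function name `warp_with_flow`, the single-channel list-of-lists layout, and the border clamping are all assumptions made for illustration.

```python
import math


def warp_with_flow(reference, flow):
    """Backward-warp a 2D feature map using a per-pixel flow field.

    reference: H x W list of floats (a single-channel feature map).
    flow: H x W list of (dx, dy) offsets; target pixel (x, y) samples
          the reference at (x + dx, y + dy).
    Sample positions are clamped to the map bounds (border padding).
    Hypothetical helper for illustration, not code from the paper.
    """
    height, width = len(reference), len(reference[0])
    warped = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            # Clamp the sampling position so it stays inside the map.
            sx = min(max(x + dx, 0.0), width - 1.0)
            sy = min(max(y + dy, 0.0), height - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
            wx, wy = sx - x0, sy - y0
            # Bilinear interpolation between the four neighbouring values.
            top = (1 - wx) * reference[y0][x0] + wx * reference[y0][x1]
            bottom = (1 - wx) * reference[y1][x0] + wx * reference[y1][x1]
            warped[y][x] = (1 - wy) * top + wy * bottom
    return warped
```

In a learned model the flow field would be predicted by a network and the warped features passed on to the denoiser; here the flow is simply supplied by the caller, and a zero flow returns the reference unchanged.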
Research Area(s)
- Circuits and systems, Correlation, Diffusion models, Human pose transfer, Image synthesis, Multi-scale feature fusing module, Noise reduction, Optical flow, Self-Calibration flow module, Task analysis, Training
Citation Format(s)
Self-Calibration Flow Guided Denoising Diffusion Model for Human Pose Transfer. / Xue, Yu; Po, Lai-Man; Yu, Wing-Yin et al.
In: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 34, No. 9, 09.2024, p. 7896-7911.