Self-Calibration Flow Guided Denoising Diffusion Model for Human Pose Transfer

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

2 Scopus Citations

Detail(s)

Original language: English
Pages (from-to): 7896-7911
Number of pages: 16
Journal / Publication: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 34
Issue number: 9
Online published: 28 Mar 2024
Publication status: Published - Sept 2024

Abstract

The human pose transfer task aims to generate synthetic person images that preserve the style of reference images while accurately aligning them with the desired target pose. However, existing methods based on generative adversarial networks (GANs) struggle to produce realistic details and often face spatial misalignment issues. On the other hand, methods relying on denoising diffusion models require a large number of model parameters, resulting in slower convergence rates. To address these challenges, we propose a self-calibration flow-guided module (SCFM) to establish precise spatial correspondence between reference images and target poses. This module facilitates the denoising diffusion model in predicting the noise at each denoising step more effectively. Additionally, we introduce a multi-scale feature fusing module (MSFF) that enhances the denoising U-Net architecture through a cross-attention mechanism, achieving better performance with a reduced parameter count. Our proposed model outperforms state-of-the-art methods on the DeepFashion and Market-1501 datasets in terms of both the quantity and quality of the synthesized images. Our code is publicly available at https://github.com/zylwithxy/SCFM-guided-DDPM. © 2024 IEEE.
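
As a rough illustration of what flow guidance means in the abstract above, the sketch below backward-warps a reference feature map with a dense flow field so that its content lines up with a target layout. It is a generic bilinear warping routine in NumPy, not the authors' SCFM; the function name `warp_with_flow`, the (dx, dy) flow convention, and the border clamping are all assumptions made for this example. The authors' actual implementation is in the repository linked in the abstract.

```python
import numpy as np


def warp_with_flow(feat, flow):
    """Backward-warp feat (H, W, C) using flow (H, W, 2) of (dx, dy) offsets.

    Output pixel (y, x) samples feat at (y + dy, x + dx) with bilinear
    interpolation. Sampling coordinates are clamped to the image border.
    Hypothetical helper for illustration only.
    """
    h, w, _ = feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Source coordinates for every output pixel, clamped to valid range.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each source coordinate.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional weights, with a trailing axis to broadcast over channels.
    wx = (sx - x0)[..., None]
    wy = (sy - y0)[..., None]

    top = (1 - wx) * feat[y0, x0] + wx * feat[y0, x1]
    bottom = (1 - wx) * feat[y1, x0] + wx * feat[y1, x1]
    return (1 - wy) * top + wy * bottom


# Toy reference features and a flow that samples one pixel to the right.
reference = np.arange(12, dtype=float).reshape(3, 4, 1)
shift_right = np.zeros((3, 4, 2))
shift_right[..., 0] = 1.0
aligned = warp_with_flow(reference, shift_right)
```

In a flow-guided pipeline of this general kind, a warped feature map such as `aligned` would serve as spatially aligned conditioning for the denoising network; how the paper's model consumes it is not specified in this record.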

Research Area(s)

  • Circuits and systems, Correlation, Diffusion models, Human pose transfer, Image synthesis, Multi-scale feature fusing module, Noise reduction, Optical flow, Self-Calibration flow module, Task analysis, Training