Unifying Multi-modal Hair Editing via Proxy Feature Blending

Tianyi Wei, Dongdong Chen*, Wenbo Zhou, Jing Liao, Can Wang, Weiming Zhang, Gang Hua, Nenghai Yu

*Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

Abstract

Hair editing is a long-standing problem in computer vision that demands both fine-grained local control and intuitive user interactions across diverse modalities. Despite the remarkable progress of GANs and diffusion models, existing methods still lack a unified framework that simultaneously supports arbitrary interaction modes (e.g., text, sketch, mask, and reference image) while ensuring precise editing and faithful preservation of irrelevant attributes. In this work, we introduce a novel paradigm that reformulates hair editing as proxy-based hair transfer. Specifically, we leverage the dense and semantically disentangled latent space of StyleGAN for precise manipulation and exploit its feature space for disentangled attribute preservation, thereby decoupling the objectives of editing and preservation. Our framework unifies different modalities by converting editing conditions into distinct transfer proxies, whose features are seamlessly blended to achieve global or local edits. Beyond 2D, we extend our paradigm to 3D-aware settings by incorporating EG3D and PanoHead, where we propose a multi-view boosted hair feature localization strategy together with 3D-tailored proxy generation methods that exploit the inherent properties of 3D-aware generative models. Extensive experiments demonstrate that our method consistently outperforms prior approaches in editing effects, attribute preservation, visual naturalness, and multi-view consistency, while offering unprecedented support for multi-modal and mixed-modal interactions.
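The abstract describes blending the features of a "transfer proxy" into the source image's generator features so that edits apply only to the hair while other attributes are preserved. The following is a minimal sketch of that mask-guided feature blending idea, not the authors' released implementation: it assumes access to intermediate StyleGAN feature maps for the source image and for a proxy (e.g., an inverted reference image or an edited latent code), plus a hair mask; all function and variable names are hypothetical.

```python
import torch
import torch.nn.functional as F

def blend_proxy_features(source_feat: torch.Tensor,
                         proxy_feat: torch.Tensor,
                         hair_mask: torch.Tensor) -> torch.Tensor:
    """Blend proxy hair features into the source feature map.

    source_feat, proxy_feat: (B, C, H, W) feature maps taken from the same
        generator layer, for the source image and the transfer proxy.
    hair_mask: (B, 1, h, w) soft mask of the hair region, values in [0, 1].
    """
    # Resize the hair mask to the spatial resolution of the feature maps.
    mask = F.interpolate(hair_mask, size=source_feat.shape[-2:],
                         mode='bilinear', align_corners=False)
    # Inside the hair region take the proxy features (editing);
    # outside it keep the source features (preservation of irrelevant attributes).
    return mask * proxy_feat + (1.0 - mask) * source_feat
```

Under this reading, a global edit (e.g., a text-driven hairstyle change) uses a mask covering the whole hair region of a single proxy, while mixed-modal edits could blend several proxies with different masks before the blended features are decoded by the generator.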
Original language: English
Number of pages: 18
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
DOIs
Publication status: Online published - 21 Jan 2026

Funding

This work was supported in part by the Natural Science Foundation of China under Grants 62372423, 62121002, and 62072421, and by the Fundamental Research Funds for the Central Universities under Grant WK2100250070. This work was also partially supported by an ITF grant (ITS/269/24FP) from the Innovation and Technology Commission (ITC) of Hong Kong. The authors thank Yi Yin for her help in this work, and the Associate Editor and reviewers for their valuable comments and suggestions.

Research Keywords

  • Multi-modal
  • Proxy Feature Blending
  • Unified hair editing
