Hunting Blemishes: Language-guided High-fidelity Face Retouching Transformer with Limited Paired Data

  • Le Jiang
  • Yan Huang*
  • Lianxin Xie
  • Wen Xue
  • Cheng Liu
  • Si Wu*
  • Hau-San Wong
  • *Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works | RGC 32 - Refereed conference paper (with host publication) | peer-review

Abstract

The prevalence of multimedia applications has led to increased concerns and demand for auto face retouching. Face retouching aims to enhance portrait quality by removing blemishes. However, the existing auto-retouching methods rely heavily on a large amount of paired training samples, and perform less satisfactorily when handling complex and unusual blemishes. To address this issue, we propose a Language-guided Blemish Removal Transformer for automatically retouching face images, while at the same time reducing the dependency of the model on paired training data. Our model is referred to as LangBRT, which leverages vision-language pre-training for precise facial blemish removal. Specifically, we design a text-prompted blemish detection module that indicates the regions to be edited. The priors not only enable the transformer network to handle specific blemishes in certain areas, but also reduce the reliance on retouching training data. Further, we adopt a target-aware cross attention mechanism, such that the blemish-like regions are edited accurately while at the same time maintaining the normal skin regions unchanged. Finally, we adopt a regularization approach to encourage the semantic consistency between the synthesized image and the text description of the desired retouching outcome. Extensive experiments are performed to demonstrate the superior performance of LangBRT over competing auto-retouching methods in terms of dependency on training data, blemish detection accuracy and synthesis quality.

© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
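
The abstract's idea of editing blemish-like regions while leaving normal skin unchanged can be illustrated with a small mask-gated cross-attention sketch. This is not the LangBRT implementation: the function names (`cross_attention`, `mask_gated_update`), the token layout, and the use of a per-token blemish mask as a blending gate are all assumptions made for illustration, written in plain Python so that it runs without extra dependencies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return attended


def mask_gated_update(image_tokens, prompt_tokens, blemish_mask):
    """Hypothetical gate: blend attended features in only where the mask is on.

    blemish_mask holds one value in [0, 1] per image token; 0 keeps the
    original token untouched and 1 replaces it with the attended feature.
    """
    attended = cross_attention(image_tokens, prompt_tokens, prompt_tokens)
    return [
        [m * a + (1.0 - m) * x for a, x in zip(att, tok)]
        for tok, att, m in zip(image_tokens, attended, blemish_mask)
    ]


# Toy inputs: three image tokens, two prompt tokens, only token 1 flagged.
image_tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
prompt_tokens = [[1.0, 1.0], [0.0, 2.0]]
blemish_mask = [0.0, 1.0, 0.0]
updated = mask_gated_update(image_tokens, prompt_tokens, blemish_mask)
```

In this toy setup, tokens whose mask value is 0 pass through unchanged, and only the flagged token is rewritten from the prompt features. The actual model presumably learns both the mask and the attention parameters, which this sketch leaves out.
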
Original language: English
Title of host publication: MM ’24
Subtitle of host publication: Proceedings of the 32nd ACM International Conference on Multimedia
Publisher: Association for Computing Machinery
Pages: 5102-5111
ISBN (Print): 9798400706868
Publication status: Published - 2024
Event: 32nd ACM International Conference on Multimedia (MM 2024) - Melbourne, Australia
Duration: 28 Oct 2024 to 1 Nov 2024
https://2024.acmmm.org/

Publication series

Name: MM - Proceedings of the ACM International Conference on Multimedia

Conference

Conference: 32nd ACM International Conference on Multimedia (MM 2024)
Abbreviated title: ACM MM’24
Place: Australia
City: Melbourne
Period: 28/10/24 to 1/11/24

Bibliographical note

Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).

Research Keywords

  • blemish detection
  • face retouching
  • transformer
  • vision-language pre-training
