Abstract
The prevalence of multimedia applications has increased both interest in and demand for automatic face retouching, which aims to enhance portrait quality by removing blemishes. However, existing auto-retouching methods rely heavily on large amounts of paired training samples and perform less satisfactorily on complex or unusual blemishes. To address this issue, we propose a Language-guided Blemish Removal Transformer (LangBRT) for automatic face retouching that reduces the model's dependency on paired training data. LangBRT leverages vision-language pre-training for precise facial blemish removal. Specifically, we design a text-prompted blemish detection module that indicates the regions to be edited. These detection priors not only enable the transformer network to handle specific blemishes in certain areas, but also reduce the reliance on retouching training data. Further, we adopt a target-aware cross attention mechanism so that blemish-like regions are edited accurately while normal skin regions remain unchanged. Finally, we adopt a regularization approach to encourage semantic consistency between the synthesized image and the text description of the desired retouching outcome. Extensive experiments demonstrate the superior performance of LangBRT over competing auto-retouching methods in terms of dependency on training data, blemish detection accuracy and synthesis quality.
© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
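The abstract does not specify how the target-aware cross attention is built. As a rough illustration of its stated goal only (edit blemish-like regions, leave normal skin unchanged), the Python sketch below gates a standard scaled dot-product cross-attention update with a per-token blemish mask. The function name `mask_gated_cross_attention`, the mask gating itself, and the toy inputs are assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mask_gated_cross_attention(queries, keys, values, blemish_mask):
    """Hypothetical illustration, NOT the LangBRT implementation.

    queries:      one feature vector per image token
    keys, values: feature vectors of the context tokens attended to
                  (values must have the same width as queries)
    blemish_mask: one weight in [0, 1] per image token
                  (1 = blemish, 0 = normal skin)
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    width = len(values[0])
    outputs = []
    for query, weight in zip(queries, blemish_mask):
        # Standard scaled dot-product attention over the context tokens.
        attn = softmax([dot(query, key) * scale for key in keys])
        update = [sum(a * v[d] for a, v in zip(attn, values)) for d in range(width)]
        # Assumed gating: the residual update is scaled by the mask,
        # so tokens with weight 0 pass through unchanged.
        outputs.append([q + weight * u for q, u in zip(query, update)])
    return outputs


# Toy example: token 0 is normal skin, token 1 is flagged as a blemish.
tokens = [[1.0, 0.0], [0.0, 1.0]]
context = [[0.5, 0.5], [1.0, -1.0]]
mask = [0.0, 1.0]
edited = mask_gated_cross_attention(tokens, context, context, mask)
```

In this toy setup the unmasked token is returned as-is, while the masked token receives the attention update. A soft mask with values between 0 and 1 would blend the two behaviours.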
| Original language | English |
|---|---|
| Title of host publication | MM ’24 |
| Subtitle of host publication | Proceedings of the 32nd ACM International Conference on Multimedia |
| Publisher | Association for Computing Machinery |
| Pages | 5102-5111 |
| ISBN (Print) | 9798400706868 |
| Publication status | Published - 2024 |
| Event | 32nd ACM International Conference on Multimedia (MM 2024), Melbourne, Australia. Duration: 28 Oct 2024 → 1 Nov 2024. https://2024.acmmm.org/ |
Publication series
| Name | MM - Proceedings of the ACM International Conference on Multimedia |
|---|---|
Conference
| Conference | 32nd ACM International Conference on Multimedia (MM 2024) |
|---|---|
| Abbreviated title | ACM MM’24 |
| Place | Australia |
| City | Melbourne |
| Period | 28/10/24 → 1/11/24 |
| Internet address | https://2024.acmmm.org/ |
Bibliographical note
Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).
Research Keywords
- blemish detection
- face retouching
- transformer
- vision-language pre-training
Fingerprint
Dive into the research topics of 'Hunting Blemishes: Language-guided High-fidelity Face Retouching Transformer with Limited Paired Data'. Together they form a unique fingerprint.