Exposing fake images generated by text-to-image diffusion models

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

8 Scopus Citations

Author(s)

  • Qiang Xu
  • Hao Wang
  • Laijin Meng
  • Zhongjie Mi
  • Jianye Yuan

Detail(s)

Original language: English
Pages (from-to): 76-82
Journal / Publication: Pattern Recognition Letters
Volume: 176
Online published: 28 Oct 2023
Publication status: Published - Dec 2023

Abstract

Text-to-image diffusion models (DMs) have posed unprecedented challenges to the authenticity and integrity of digital images, making the detection of computer-generated images one of the most important image forensics techniques. However, the detection of images generated by text-to-image diffusion models is rarely reported in the literature. To tackle this issue, we first analyze the acquisition process of DM images. We then construct a hybrid neural network based on an attention-guided feature extraction (AGFE) module and a vision transformers (ViTs)-based feature extraction (ViTFE) module. An attention mechanism is adopted in the AGFE module to capture long-range feature interactions and boost representation capability. The ViTFE module, containing a sequential MobileNetV2 (MNV2) block and MobileViT blocks, is designed to learn global representations. Extensive experiments on different types of generated images demonstrate the effectiveness and robustness of our method in exposing fake images generated by text-to-image diffusion models. © 2023 Elsevier B.V.
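The abstract describes an attention mechanism that captures long-range feature interactions across a feature map. The paper's code is not given here; as a minimal hedged sketch (not the authors' implementation), the core idea can be illustrated as self-attention over flattened spatial positions, where every position attends to every other position regardless of distance. All function and variable names below are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_self_attention(fmap, Wq, Wk, Wv):
    """Self-attention over the spatial positions of a feature map.

    fmap: (H, W, C) feature map; Wq, Wk, Wv: (C, d) projection matrices.
    Each of the H*W positions attends to every other position, which is
    how attention models long-range interactions that a local
    convolution cannot capture.
    """
    H, W, C = fmap.shape
    x = fmap.reshape(H * W, C)                 # flatten grid -> tokens
    q, k, v = x @ Wq, x @ Wk, x @ Wv           # queries, keys, values
    scores = q @ k.T / np.sqrt(Wq.shape[1])    # (HW, HW) pairwise affinities
    attn = softmax(scores, axis=-1)            # each row sums to 1
    out = attn @ v                             # attention-weighted mixing
    return out.reshape(H, W, -1)

rng = np.random.default_rng(0)
f = rng.standard_normal((8, 8, 16))            # toy 8x8 feature map, 16 channels
Wq = Wk = Wv = rng.standard_normal((16, 16)) * 0.1
y = spatial_self_attention(f, Wq, Wk, Wv)
print(y.shape)  # (8, 8, 16)
```

The quadratic (HW × HW) attention map is why hybrid designs such as the one described (MobileNetV2 blocks for local features plus MobileViT blocks for global representations) are common: convolutions keep the cost low while attention supplies the global context.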

Research Area(s)

  • Attention mechanism, Diffusion models (DM), Image forensics, Text-to-image, Vision transformers (ViTs)

Citation Format(s)

Exposing fake images generated by text-to-image diffusion models. / Xu, Qiang; Wang, Hao; Meng, Laijin et al.
In: Pattern Recognition Letters, Vol. 176, 12.2023, p. 76-82.
