Hiding Data in Generative Models

Student thesis: Doctoral Thesis

Abstract

Data hiding, encompassing both steganography and watermarking, refers to the process of embedding secret information within a medium or transforming secret information into another medium. Like many domains in signal and image processing, data hiding has experienced a transformative evolution with the rise of deep neural networks (DNNs). Despite recent advances, existing DNN-based data hiding techniques still face notable limitations, which differ depending on the application—steganography or watermarking—due to their distinct goals and requirements.

The first part of the thesis addresses the limitations of standard DNN-based steganography methods. We propose a probabilistic image-hiding framework that embeds a secret image into a specific region of the probability distribution of the cover image by training generative models from scratch. Specifically, we instantiate this framework using SinGAN, a pyramid of generative adversarial networks (GANs) that learns the patch distribution of a single image. The secret is embedded by learning a deterministic mapping from a fixed set of noise maps—generated using an embedding key—to the secret image during patch distribution learning. The resulting stego SinGAN preserves the generative capabilities of the original model and can be publicly shared; only recipients with the embedding key can successfully extract the hidden image, ensuring secure and covert communication.
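To illustrate the embedding-key mechanism, the sketch below (hypothetical names and shapes, not the thesis code) derives a fixed set of multi-scale noise maps from a key by hashing it into a PRNG seed. The same key always regenerates the exact noise input that the stego model maps to the secret image, while any other key yields unrelated noise and hence no extraction.

```python
import hashlib
import numpy as np

def key_to_noise_maps(embedding_key: str, shapes):
    """Derive a fixed set of noise maps from an embedding key.

    The key is hashed into a PRNG seed, so the same key always
    reproduces the same noise maps; a different key gives
    statistically unrelated noise.
    """
    seed = int.from_bytes(hashlib.sha256(embedding_key.encode()).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    return [rng.standard_normal(shape).astype(np.float32) for shape in shapes]

# One noise map per scale of the (illustrative) generator pyramid.
shapes = [(32, 32), (64, 64), (128, 128)]
maps_a = key_to_noise_maps("recipient-key", shapes)
maps_b = key_to_noise_maps("recipient-key", shapes)  # same key: identical maps
maps_c = key_to_noise_maps("wrong-key", shapes)      # wrong key: unrelated maps
```

In the actual framework, these fixed maps would be fed to the trained stego SinGAN to reproduce the secret image; here they only demonstrate the key-dependence of the input.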

While this probabilistic approach is effective for certain generative models, it becomes impractical for others, such as diffusion models, due to the high computational cost of training from scratch. To address this, we propose a novel framework for embedding secret images into diffusion models at a specific denoising timestep. This is achieved by refining the learned score function through a hybrid parameter-efficient fine-tuning (PEFT) strategy that combines selective fine-tuning with reparameterized fine-tuning techniques. Our approach enhances extraction accuracy, model fidelity, and hiding efficiency over existing methods, making diffusion-based neural network steganography more practical and scalable.
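The hybrid PEFT idea can be caricatured in a few lines: a LoRA-style low-rank reparameterization carries most of the update, while a binary mask selectively unfreezes a small subset of the original weights. All shapes, the rank, and the mask choice below are illustrative assumptions, not the configuration used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pretrained weight of one score-network layer (illustrative size).
W = rng.standard_normal((256, 256)).astype(np.float32)

# Reparameterized branch: low-rank update delta_W = B @ A, so only
# r * (m + n) parameters are trained instead of m * n.
r = 4
A = rng.standard_normal((r, 256)).astype(np.float32) * 0.01
B = np.zeros((256, r), dtype=np.float32)  # zero init: delta_W starts at 0

# Selective branch: additionally unfreeze a small subset of the original
# weights, indicated by a binary mask (here, a single row).
mask = np.zeros_like(W)
mask[0, :] = 1.0
dW_selective = np.zeros_like(W)  # trainable, applied only where mask == 1

def effective_weight():
    """Weight used at inference: frozen base + both PEFT branches."""
    return W + B @ A + mask * dW_selective

n_trainable = A.size + B.size + int(mask.sum())  # far fewer than W.size
```

Before any fine-tuning, `effective_weight()` equals the pretrained `W`, so model fidelity is preserved by construction at initialization; training then perturbs only the small trainable set.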

Despite the effectiveness of the proposed methods in neural network steganography, they lack the robustness required for watermarking applications. Accordingly, the second part of the thesis focuses on robust data hiding in generative models and presents two distinct frameworks: one for ownership verification and another for generated content attribution. For ownership verification, we propose a generative classifier-based watermarking framework for diffusion models. Unlike prior methods, our approach embeds watermarks after constructing a diffusion classifier, integrating the watermark directly into the generative process without requiring external decoders. We introduce sensitivity metrics to quantify the robustness of the watermark against model weight modifications and adopt a bi-level optimization strategy to jointly refine the learned score function and trigger image. This ensures that the watermark remains both imperceptible and robust to weight perturbations.

While the generative classifier-based framework effectively addresses the ownership verification of generative models, it does not handle the emerging challenge of generated content attribution, which is critical for mitigating the misuse of AI-generated media. Existing watermarking techniques for generated content attribution are often incompatible with new state-of-the-art architectures, particularly visual autoregressive (VAR) models. To bridge this gap, we propose a two-stage watermarking framework specifically tailored for VAR models. This method ensures that watermark signals are embedded in generated images through the standard generation process without any explicit watermark injection. The embedded watermark demonstrates strong robustness and remains extractable despite image distortions, offering a dependable and practical solution for generated content attribution.

In summary, this thesis presents a probabilistic image-hiding framework to overcome limitations in standard DNN-based steganography, an efficient diffusion-based neural network steganography method utilizing PEFT for improved practicality, a robust generative classifier-based watermarking framework for ownership verification of diffusion models, and a two-stage watermarking approach for generated content attribution tailored to visual autoregressive models. Collectively, these contributions advance the field of data hiding and provide practical solutions to generative neural network-based steganography and watermarking applications.
Date of Award: 28 Aug 2025
Original language: English
Awarding Institution: City University of Hong Kong
Supervisors: Kede Ma (Supervisor) & Linqi Song (Co-supervisor)
