Learning to Generate Single or Multiple Domain Images

單域或多域圖片的生成學習

Student thesis: Doctoral Thesis

Award date: 18 Dec 2018

Abstract

Deep learning has driven a profound transformation and has been applied to many real-world tasks such as image classification and object detection. These tasks fall squarely within supervised learning, where large amounts of labeled data are provided for the learning process. Unsupervised learning, by contrast, has benefited far less from deep learning. Image generation is a typical unsupervised learning problem, whose goal is to learn the distribution over images and then generate images by sampling from the learned distribution. Image generation is an important and fundamental problem in computer vision, and the recent success of deep generative models on this problem has driven progress on numerous computer vision tasks such as image super-resolution, image-to-image translation, and semi-supervised learning. Multi-domain image generation extends image generation by aiming to generate aligned image pairs across different domains. It also has many promising applications, such as improving generated image quality, image-to-image translation, and unsupervised domain adaptation. In this thesis, we investigate several approaches to image generation and multi-domain image generation. All of our methods are based on generative adversarial networks (GANs).

We first introduce GAN-based models for image generation. Regular GANs formulate the discriminator as a classifier with the sigmoid cross-entropy loss function. However, we found that this loss function may cause vanishing gradients during training. To address this, we propose Least Squares Generative Adversarial Networks (LSGANs), which adopt the least squares loss for both the discriminator and the generator. We also investigate how to improve the training stability of LSGANs. Finally, we show how to apply our proposed model to Chinese character recognition.
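The vanishing-gradients issue can be illustrated with a small numerical sketch. Using the common LSGAN target of 1 for samples the generator wants classified as real, the generator's least squares loss keeps a large gradient even when the discriminator confidently rejects a fake sample, whereas the sigmoid cross-entropy loss saturates. The function names below are illustrative, not code from the thesis.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gan_gen_grad(d):
    """Gradient of the regular GAN generator loss L = log(1 - sigmoid(d))
    with respect to the discriminator logit d: dL/dd = -sigmoid(d)."""
    return -sigmoid(d)

def lsgan_gen_grad(d):
    """Gradient of the LSGAN generator loss L = 0.5 * (d - 1)^2
    (target value 1 for fake samples): dL/dd = d - 1."""
    return d - 1.0

# When the discriminator confidently rejects a fake sample (d very negative),
# the sigmoid cross-entropy gradient nearly vanishes, while the least squares
# gradient stays large, pushing the generated sample toward the decision boundary.
for d in [-8.0, -4.0, 0.0]:
    print(f"d={d:5.1f}  sigmoid-CE grad={gan_gen_grad(d):+.5f}  "
          f"LS grad={lsgan_gen_grad(d):+.1f}")
```

At a logit of -8 the sigmoid cross-entropy gradient is on the order of 3e-4, while the least squares gradient has magnitude 9, which is the intuition behind replacing the sigmoid cross-entropy loss.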

We then present a model called Regularized Conditional GAN (RegCGAN) for multi-domain image generation. Directly using a conditional GAN for multi-domain image generation fails to learn the corresponding semantics. To overcome this problem, we propose two regularizers that guide the model to encode the common semantics in the shared latent variables and the domain-specific semantics in the domain variables, which in turn enables the model to generate corresponding images. We also introduce an approach to applying RegCGAN to unsupervised domain adaptation.
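The input structure behind this idea can be sketched as follows: each domain's generator input shares one latent vector z (the common semantics) and differs only in an appended domain code. The `make_inputs` and `pairwise_regularizer` helpers, the dimension sizes, and the L2 feature penalty are illustrative assumptions, not the thesis's actual regularizers or architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_inputs(batch=4, z_dim=64, n_domains=2):
    """Build generator inputs for several domains from one shared latent z.

    The shared z is intended to carry the common semantics; the one-hot
    domain code selects the domain-specific appearance.
    (Illustrative sketch, not the thesis's actual architecture.)
    """
    z = rng.standard_normal((batch, z_dim))
    inputs = []
    for d in range(n_domains):
        code = np.zeros((batch, n_domains))
        code[:, d] = 1.0
        inputs.append(np.concatenate([z, code], axis=1))
    return inputs

def pairwise_regularizer(feat_a, feat_b):
    """One plausible correspondence regularizer: penalize the squared L2
    distance between intermediate features of images generated from the
    same shared z in different domains."""
    return np.mean(np.sum((feat_a - feat_b) ** 2, axis=1))

x0, x1 = make_inputs()
# The shared latent part is identical across domains; only the domain
# code differs, so any output difference must come from the domain code.
print(np.allclose(x0[:, :64], x1[:, :64]))  # True
```

A penalty of this kind, applied to features of the paired generated images, nudges the generators to keep the shared z responsible for the aligned content across domains.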