Unsupervised Learning for Image Enhancement


Student thesis: Doctoral Thesis





Awarding Institution
Award date: 10 Aug 2021


The advent of the era of big data has brought diversification and ubiquity of data acquisition devices. Images inevitably suffer from a wide variety of degradations in visual quality, such as color casting, low contrast, and intensive noise, introduced by bad weather, the limitations of hardware devices, and a lack of professional photography skills. As a result, image enhancement is highly desired by the general public, who lack professional image-editing skills, and it can also benefit high-level computer vision tasks. Therefore, in this thesis, we investigate the image enhancement problem based on unsupervised learning to improve the aesthetic quality of general images and the visual quality of low-light images. We hope that our research can bring new insights that improve the practicability and robustness of image and video enhancement in real scenarios, and expect it to benefit a variety of applications in the artificial intelligence era. The work consists of three parts, succinctly summarized as follows.

The first part focuses on user-oriented general image aesthetic quality enhancement based on a bidirectional generative adversarial network (GAN). We propose a quality attention generative adversarial network (QAGAN) that can be trained on unpaired data in an unsupervised manner. The key novelty of the proposed QAGAN lies in the quality attention module (QAM) injected into the generator, so that the generator can learn domain-relevant quality attention directly from the two domains. More specifically, the proposed QAM allows the generator to effectively select semantically related characteristics based on spatial correlation and to adaptively incorporate style-related attributes based on channel correlation. Extensive experimental results demonstrate that our proposed method outperforms state-of-the-art methods both quantitatively and qualitatively.
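The channel-correlation idea described above can be sketched as follows. This is a minimal illustrative reading, not the thesis implementation: the feature shapes, the softmax normalization, and the residual wiring are assumptions, and a spatial-attention branch would apply the same construction to the correlation between spatial positions instead of channels.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feat):
    """Re-weight a (C, H, W) feature map via channel-wise correlation.

    Illustrative sketch: channels that correlate strongly exchange
    more information; a residual connection preserves the input.
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)        # (C, N) with N = H * W
    corr = flat @ flat.T                 # (C, C) channel correlation
    attn = softmax(corr, axis=-1)        # each row sums to 1
    mixed = attn @ flat                  # mix channels by attention weights
    return mixed.reshape(c, h, w) + feat # residual connection (assumption)

feat = np.random.rand(8, 4, 4)
out = channel_attention(feat)
assert out.shape == feat.shape
```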

The second part investigates unsupervised general image aesthetic quality enhancement based on a unidirectional GAN model. We propose an unsupervised image enhancement generative adversarial network (UEGAN), which learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner, rather than learning from a large number of paired low/high-quality images. The proposed UEGAN is based on a unidirectional GAN that embeds modulation and attention mechanisms to capture richer global and local features. Based on the proposed model, we introduce two losses to handle unsupervised image enhancement: (1) a fidelity loss, defined as an l2 regularization in the feature domain of a pre-trained VGG network, which ensures the consistency of content between the enhanced and input images, and (2) a quality loss, formulated as a relativistic hinge adversarial loss, which helps obtain enhanced results with the desired characteristics. Both quantitative and qualitative results show that the proposed model effectively improves the aesthetic quality of images.
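The two losses admit a compact sketch. The formulas below follow the standard definitions the text names (l2 distance in a feature space; relativistic average hinge loss); the toy score vectors are made-up inputs, and in the thesis the feature maps would come from a pre-trained VGG network rather than being passed in directly.

```python
import numpy as np

def fidelity_loss(feat_enhanced, feat_input):
    """l2 regularization between feature maps of enhanced and input images."""
    return float(np.mean((feat_enhanced - feat_input) ** 2))

def relativistic_hinge_d(d_real, d_fake):
    """Discriminator side of the relativistic (average) hinge loss:
    real scores should exceed the mean fake score by a margin, and
    fake scores should fall below the mean real score by a margin."""
    return float(np.mean(np.maximum(0.0, 1.0 - (d_real - d_fake.mean())))
                 + np.mean(np.maximum(0.0, 1.0 + (d_fake - d_real.mean()))))

def relativistic_hinge_g(d_real, d_fake):
    """Generator side: the roles of real and fake scores are swapped."""
    return float(np.mean(np.maximum(0.0, 1.0 + (d_real - d_fake.mean())))
                 + np.mean(np.maximum(0.0, 1.0 - (d_fake - d_real.mean()))))

# Toy scores for a discriminator that already separates the two domains:
d_real = np.array([0.9, 1.2, 0.8])
d_fake = np.array([-0.7, -1.1, -0.9])
assert relativistic_hinge_d(d_real, d_fake) == 0.0   # margins satisfied
assert relativistic_hinge_g(d_real, d_fake) > 0.0    # generator still penalized
```

Relative to the ordinary hinge loss, the relativistic form scores each sample against the average score of the opposite set, which tends to stabilize unpaired adversarial training.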

The third part studies unsupervised low-light image enhancement based on a bidirectional GAN. Free of the fundamental limitation of fitting paired training data, recent unsupervised low-light enhancement methods excel at adjusting the illumination and contrast of images. However, the remaining noise-suppression issue, caused by the lack of supervision on detailed signals, largely impedes the wide deployment of these methods in real-world applications. We propose a novel Cycle-Interactive Generative Adversarial Network (CIGAN) for unsupervised low-light image enhancement, which is capable of not only better transferring illumination distributions between low/normal-light images but also manipulating detailed signals between the two domains. In particular, the proposed low-light guided transformation (LGT) feed-forwards the features of low-light images from the generator of the enhancement GAN (eGAN) into the generator of the degradation GAN (dGAN) to synthesize more realistic low-light images with diverse illumination and contrast. Moreover, the feature randomized perturbation (FRP) module in dGAN learns to increase feature randomness to produce diverse feature distributions, encouraging the synthesized low-light images to contain realistic noise. Extensive experiments demonstrate both the superiority of the proposed method and the effectiveness of each module in CIGAN.
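One plausible reading of an FRP-style step is randomly scaled noise injection into intermediate features, so that repeated passes over the same input yield diverse feature distributions. The per-channel scaling scheme below is an assumption made for illustration only; the thesis module is learned, not hand-set.

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_randomized_perturbation(feat, strength=0.1):
    """Add randomly scaled Gaussian noise to a (C, H, W) feature map.

    Sketch of an FRP-style step: each channel draws its own noise
    scale, so two calls on identical input produce different features,
    which in turn lets dGAN synthesize diverse, realistic noise.
    """
    c = feat.shape[0]
    scale = rng.uniform(0.0, strength, size=(c, 1, 1))  # per-channel scale
    noise = rng.standard_normal(feat.shape)             # Gaussian perturbation
    return feat + scale * noise

feat = np.ones((4, 3, 3))
out1 = feature_randomized_perturbation(feat)
out2 = feature_randomized_perturbation(feat)
assert out1.shape == feat.shape
assert not np.allclose(out1, out2)  # same input, diverse features
```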

    Research areas

  • unsupervised learning, image enhancement, generative adversarial network