Deep Unsupervised Visual Domain Adaptation via Discriminative Adversarial Learning
Student thesis: Doctoral Thesis
Award date: 30 Jul 2021
Permanent Link: https://scholars.cityu.edu.hk/en/theses/theses(7e3de2e5-d64e-40ca-9373-950ec47ac42a).html
Abstract
Computer vision tasks such as image classification and object recognition have witnessed huge successes in recent years, in large part due to the impressive progress of deep neural networks (DNNs). The quantity of labeled training examples plays a crucial role in training DNNs. In addition, DNNs implicitly assume that the testing and training data are sampled from the same distribution. This assumption is easily violated in practice, and when it is, performance on the testing data can degrade drastically. Unsupervised domain adaptation (UDA) is introduced to facilitate knowledge transfer from richly labeled data (source data) to completely unlabeled data (target data). The two domains share the same label space but may follow different distributions, a discrepancy known as domain shift. UDA aims to enhance model performance on target samples in the presence of domain shift.
To alleviate the domain shift problem, previous works have focused on aligning the marginal distributions of the two domains while ensuring that the trained model remains highly discriminative on source data. Distance metrics and adversarial training are the prevalent techniques for conducting domain alignment. However, aligning marginal distributions alone is insufficient to prevent the performance drop when the model is tested on target data. In this thesis, we investigate aligning the class-conditional distributions, rather than the marginal ones, using various strategies of adversarial training. Specifically, we emphasize the discriminative properties of the features extracted from target data. To achieve this goal, we propose several UDA methods that either constrain the aligned target features to preserve properties useful for generalization, such as smoothness and compactness, or introduce a generative model that mutually reinforces discriminative learning on target data. The adversarial training strategies are employed at either the pixel level or the feature level. The proposed works include:
- The conditional generative model 'KTransGAN' is introduced to tackle the domain shift problem by mutually promoting performance with a discriminative model. The generative model uses adversarial training and variational inference to approximate the target distribution conditioned on the class labels. Because labels are missing in the target data, the generator uses the classifier's predictions as pseudo-labels. The classifier, in turn, aims to improve its accuracy on target data with the help of source samples and of samples synthesized by the generator. Feature matching is also used to stabilize training. Empirically, the proposed model exhibits outstanding synthesis quality and classification performance; the experiments are conducted on a number of synthetic datasets and multiple real-world benchmarks. (A minimal sketch of this mutual-training loop is given after this list.)
- An adversarially constrained interpolation technique is proposed to overcome limitations of domain adversarial training. Domain adversarial training learns domain-indistinguishable features from source and target data while the classifier is trained on source data in a supervised mode. However, this approach faces two challenges: samples from the two domains are insufficient to guarantee domain invariance over most of the latent space, and neighboring samples from the target domain may not belong to the same class on the low-dimensional manifold. We propose two strategies to mitigate these drawbacks. First, we incorporate a domain mixup strategy into a domain adversarial learning model by performing linear interpolation between source and target samples; this encourages a continuous latent space and improves domain matching. Second, the domain discriminator is regularized by judging the relative difference between the two domains for the mixed-up input features, which accelerates domain matching. Experimental results show that the proposed model achieves superior performance on different tasks under various domain shifts and data complexities. (A sketch of the mixup-regularized domain adversarial objective follows this list.)
- Another strategy for tackling the UDA problem is to perform adversarial training between two classifiers and their shared feature extractor. The two classifiers are trained to detect target regions that are misaligned with the source domain, while the feature extractor aligns these features by confusing the classifiers. Although this method yields improvement, it ignores the relationships among target neighbors, which may limit model performance. In this work, we propose a new alignment strategy based on the "cluster assumption" to ensure that the aligned target features preserve their cluster structure by keeping them away from the decision boundaries. Furthermore, to make the aligned features more compact, we constrain them to be robust against adversarial perturbations using different views of the classifiers. Extensive experiments demonstrate the effectiveness of our solution on various datasets. (See the bi-classifier alignment sketch after this list.)
- An adversarially smoothed feature alignment (AdvSFA) framework is also proposed to identify ambiguous target inputs by maximizing the classifiers' discrepancy in an extended class space. This enables the generator to receive valuable feedback from the classifiers and consequently to learn more discriminative and smoother representations. Smoothness of the latent manifold is a desirable property that improves model generalization and discourages neighboring samples from belonging to different classes. To further promote such properties, we task the generator with aligning features not only at target examples but also at interpolations in-between them. With these constraints, our method shows a remarkable improvement across different adaptation tasks on two benchmark datasets. (A sketch of this interpolation-based alignment objective appears below.)
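Below is a minimal PyTorch-style sketch of the mutual-training idea behind the first work (KTransGAN). It is an illustration under simplifying assumptions, not the thesis implementation: the variational-inference component and feature matching are omitted, and all module names, dimensions, and loss weights (Generator, Discriminator, Classifier, FDIM, ZDIM, the 0.1 weight) are hypothetical. The point it shows is the mutual reinforcement: the classifier supplies pseudo-labels that condition the generator, and the generator supplies class-conditional target-like samples that augment the classifier's training data.

```python
# Sketch of KTransGAN-style mutual training (illustrative, not the thesis code).
import torch
import torch.nn as nn
import torch.nn.functional as F

FDIM, ZDIM, NUM_CLASSES = 64, 16, 10   # assumed feature/noise/class sizes

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(ZDIM + NUM_CLASSES, 128), nn.ReLU(),
                                 nn.Linear(128, FDIM))
    def forward(self, z, y_onehot):
        return self.net(torch.cat([z, y_onehot], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(FDIM + NUM_CLASSES, 128), nn.ReLU(),
                                 nn.Linear(128, 1))
    def forward(self, x, y_onehot):
        return self.net(torch.cat([x, y_onehot], dim=1))

class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(FDIM, 128), nn.ReLU(),
                                 nn.Linear(128, NUM_CLASSES))
    def forward(self, x):
        return self.net(x)

G, D, C = Generator(), Discriminator(), Classifier()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_c = torch.optim.Adam(C.parameters(), lr=1e-3)

def train_step(x_src, y_src, x_tgt):
    # 1) Pseudo-label unlabeled target samples with the current classifier.
    with torch.no_grad():
        y_tgt = C(x_tgt).argmax(dim=1)
    y_tgt_oh = F.one_hot(y_tgt, NUM_CLASSES).float()

    # 2) Discriminator: real target samples vs. class-conditional fakes.
    z = torch.randn(x_tgt.size(0), ZDIM)
    x_fake = G(z, y_tgt_oh)
    ones, zeros = torch.ones(x_tgt.size(0), 1), torch.zeros(x_tgt.size(0), 1)
    d_loss = F.binary_cross_entropy_with_logits(D(x_tgt, y_tgt_oh), ones) + \
             F.binary_cross_entropy_with_logits(D(x_fake.detach(), y_tgt_oh), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 3) Generator: fool the discriminator for the conditioned class.
    g_loss = F.binary_cross_entropy_with_logits(D(x_fake, y_tgt_oh), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    # 4) Classifier: labeled source data plus generated samples labeled by
    #    their conditioning class (0.1 is an assumed weighting).
    c_loss = F.cross_entropy(C(x_src), y_src) + \
             0.1 * F.cross_entropy(C(x_fake.detach()), y_tgt)
    opt_c.zero_grad(); c_loss.backward(); opt_c.step()
    return d_loss.item(), g_loss.item(), c_loss.item()

# Toy usage with random tensors standing in for extracted features.
x_s, y_s, x_t = torch.randn(32, FDIM), torch.randint(0, NUM_CLASSES, (32,)), torch.randn(32, FDIM)
print(train_step(x_s, y_s, x_t))
```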
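The second work builds on domain adversarial training with a mixup constraint. The sketch below shows one plausible wiring of such an objective, assuming input-space mixup, a gradient-reversal layer, and a domain critic that is given the mix ratio as a soft domain label; the module sizes, the Beta(0.2, 0.2) mixing distribution, and the equal loss weighting are illustrative assumptions rather than the thesis's exact design.

```python
# Sketch of domain-adversarial training with a domain mixup constraint
# (illustrative of the second work above; names and weights are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negated gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

feat = nn.Sequential(nn.Linear(256, 128), nn.ReLU())                   # feature extractor
clf = nn.Linear(128, 10)                                                # class predictor
dom = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))    # domain critic
opt = torch.optim.Adam(list(feat.parameters()) + list(clf.parameters())
                       + list(dom.parameters()), lr=1e-3)

def train_step(x_src, y_src, x_tgt, grl_weight=1.0):
    # Mixup in input space: lam * source + (1 - lam) * target.
    lam = torch.distributions.Beta(0.2, 0.2).sample((x_src.size(0), 1))
    x_mix = lam * x_src + (1.0 - lam) * x_tgt

    f_src, f_tgt, f_mix = feat(x_src), feat(x_tgt), feat(x_mix)

    # Supervised classification on source features.
    cls_loss = F.cross_entropy(clf(f_src), y_src)

    # The critic sees reversed gradients, so minimizing its loss w.r.t. the
    # feature extractor pushes the two domains together. The mixed features
    # get the mix ratio as a soft domain label, regularizing the critic.
    d_src = dom(GradReverse.apply(f_src, grl_weight))
    d_tgt = dom(GradReverse.apply(f_tgt, grl_weight))
    d_mix = dom(GradReverse.apply(f_mix, grl_weight))
    dom_loss = F.binary_cross_entropy_with_logits(d_src, torch.ones_like(d_src)) \
             + F.binary_cross_entropy_with_logits(d_tgt, torch.zeros_like(d_tgt)) \
             + F.binary_cross_entropy_with_logits(d_mix, lam)

    loss = cls_loss + dom_loss
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Toy usage with random tensors standing in for source/target batches.
x_s, y_s, x_t = torch.randn(32, 256), torch.randint(0, 10, (32,)), torch.randn(32, 256)
print(train_step(x_s, y_s, x_t))
```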
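For the third work, the sketch below follows the familiar bi-classifier min-max pattern and adds a conditional-entropy term for the cluster assumption; a simple noise-consistency term stands in for the adversarial-perturbation robustness constraint, which the thesis formulates more carefully. Architectures, step ordering, and the 0.1 weights are assumptions made for illustration.

```python
# Sketch of bi-classifier alignment with a cluster-assumption term
# (illustrative of the third work above, not the thesis implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

G = nn.Sequential(nn.Linear(256, 128), nn.ReLU())   # shared feature extractor
C1, C2 = nn.Linear(128, 10), nn.Linear(128, 10)      # two classifiers
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(list(C1.parameters()) + list(C2.parameters()), lr=1e-3)

def discrepancy(p1, p2):
    # L1 distance between the two classifiers' softmax outputs.
    return (F.softmax(p1, dim=1) - F.softmax(p2, dim=1)).abs().mean()

def entropy(p):
    # Conditional entropy of predictions: the cluster assumption penalizes
    # decision boundaries that cross dense target regions.
    prob = F.softmax(p, dim=1)
    return -(prob * torch.log(prob + 1e-8)).sum(dim=1).mean()

def train_step(x_src, y_src, x_tgt):
    # Step A: supervised training of G, C1, C2 on source data.
    f_src = G(x_src)
    loss_a = F.cross_entropy(C1(f_src), y_src) + F.cross_entropy(C2(f_src), y_src)
    opt_g.zero_grad(); opt_c.zero_grad(); loss_a.backward(); opt_g.step(); opt_c.step()

    # Step B: freeze G (detach), maximize classifier discrepancy on target to
    # expose misaligned target regions while keeping source accuracy.
    f_src, f_tgt = G(x_src).detach(), G(x_tgt).detach()
    loss_b = F.cross_entropy(C1(f_src), y_src) + F.cross_entropy(C2(f_src), y_src) \
             - discrepancy(C1(f_tgt), C2(f_tgt))
    opt_c.zero_grad(); loss_b.backward(); opt_c.step()

    # Step C: freeze C1, C2, train G to minimize the discrepancy, plus the
    # cluster-assumption entropy term and a consistency term under a small
    # input perturbation (a simplified stand-in for adversarial perturbations).
    f_tgt = G(x_tgt)
    p1, p2 = C1(f_tgt), C2(f_tgt)
    f_pert = G(x_tgt + 0.01 * torch.randn_like(x_tgt))
    consist = F.kl_div(F.log_softmax(C1(f_pert), dim=1),
                       F.softmax(p1, dim=1).detach(), reduction='batchmean')
    loss_c = discrepancy(p1, p2) + 0.1 * (entropy(p1) + entropy(p2)) + 0.1 * consist
    opt_g.zero_grad(); loss_c.backward(); opt_g.step()
    return loss_a.item(), loss_b.item(), loss_c.item()

x_s, y_s, x_t = torch.randn(32, 256), torch.randint(0, 10, (32,)), torch.randn(32, 256)
print(train_step(x_s, y_s, x_t))
```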
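Finally, a sketch in the same spirit for the AdvSFA objective. The exact construction of the extended class space is not reproduced here; the code simply doubles the classifier output as a placeholder assumption and concentrates on the stated idea: maximizing and then minimizing classifier discrepancy both at target samples and at interpolations in-between them, which encourages a smoother latent manifold.

```python
# Sketch of an AdvSFA-style min-max with interpolation alignment
# (illustrative only; the extended class space here is a placeholder).
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10
EXT = 2 * NUM_CLASSES   # assumed size of the extended class space

G = nn.Sequential(nn.Linear(256, 128), nn.ReLU())   # feature generator
C1, C2 = nn.Linear(128, EXT), nn.Linear(128, EXT)    # classifiers over the extended space
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(list(C1.parameters()) + list(C2.parameters()), lr=1e-3)

def discrepancy(p1, p2):
    return (F.softmax(p1, dim=1) - F.softmax(p2, dim=1)).abs().mean()

def train_step(x_src, y_src, x_tgt):
    # Source supervision, using the first NUM_CLASSES logits of the extended
    # space as the ordinary class logits (an illustrative convention).
    f_src = G(x_src)
    sup = F.cross_entropy(C1(f_src)[:, :NUM_CLASSES], y_src) \
        + F.cross_entropy(C2(f_src)[:, :NUM_CLASSES], y_src)

    # Interpolations between pairs of target samples: alignment is imposed
    # in-between target examples, not only at the examples themselves.
    lam = torch.rand(x_tgt.size(0), 1)
    x_mix = lam * x_tgt + (1.0 - lam) * x_tgt.flip(0)

    # (1) Classifiers maximize discrepancy at target points and interpolations
    #     to flag ambiguous inputs (generator frozen via detach).
    f_t, f_m = G(x_tgt).detach(), G(x_mix).detach()
    loss_c = sup - discrepancy(C1(f_t), C2(f_t)) - discrepancy(C1(f_m), C2(f_m))
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()

    # (2) Generator minimizes the same discrepancy, smoothing the latent
    #     manifold along the interpolation paths.
    f_t, f_m = G(x_tgt), G(x_mix)
    loss_g = discrepancy(C1(f_t), C2(f_t)) + discrepancy(C1(f_m), C2(f_m))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_c.item(), loss_g.item()

x_s, y_s, x_t = torch.randn(32, 256), torch.randint(0, 10, (32,)), torch.randn(32, 256)
print(train_step(x_s, y_s, x_t))
```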