On Robustness of Deep Learning: Adversarial Vulnerability, Distribution Shift and Training Stability


Student thesis: Doctoral Thesis



Related Research Unit(s)


Awarding Institution
Award date: 6 Jan 2023


Deep learning has thrived over the last decade thanks to its efficacy in addressing numerous problems in computer vision, natural language processing and many other fields. However, a lack of robustness remains a main obstacle to its deployment in real-world applications. First, adversarial attacks have been shown to drastically degrade the performance of deep neural networks. Second, when the test-time distribution differs from the training distribution, generalization performance often worsens. Finally, the training of deep neural networks is highly sensitive to hyperparameters, and in regression tasks the result is sensitive to random initialization. This thesis investigates these three aspects of robustness in deep learning.

Using weight decay to penalize the L2 norms of weights has been a standard training practice for regularizing the complexity of neural networks. We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with positively homogeneous activation functions, such as linear, ReLU and max-pooling functions. To address this shortcoming, we propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network. The derived regularizer is an upper bound on the input gradient of the network, so minimizing it also benefits adversarial robustness. We demonstrate that the proposed regularizer improves generalization and adversarial robustness on various datasets and neural network architectures. We also investigate the scale-variant property of the cross-entropy loss, the most commonly used loss function in classification tasks, and its impact on the effective margin and adversarial robustness of deep neural networks. Since the loss function is not invariant to logit scaling, increasing the effective weight norm makes the loss approach zero and its gradient vanish while the effective margin is not adequately maximized. Our empirical study on feedforward DNNs demonstrates that the proposed effective margin regularization (EMR) learns large effective margins and boosts adversarial robustness in both standard and adversarial training. On large-scale models, we show that EMR substantially outperforms vanilla adversarial training and two regularization baselines.
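The two scale effects described above can be illustrated with a small, self-contained sketch. It is not the thesis's regularizer or EMR; it only shows, on a hypothetical bias-free two-layer ReLU network with made-up weights, that (a) multiplying one layer by a positive constant `c` and the next by `1/c` leaves the output unchanged while the L2 penalty changes, and (b) scaling the logits of a correctly classified example drives the cross-entropy loss toward zero without changing the prediction. All names and values here are assumptions chosen for illustration.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(W1, W2, x):
    """Bias-free two-layer ReLU network (illustrative only)."""
    return matvec(W2, relu(matvec(W1, x)))

def l2_penalty(*weights):
    """Sum of squared entries, the quantity weight decay penalizes."""
    return sum(w * w for W in weights for row in W for w in row)

def scale(W, c):
    return [[c * w for w in row] for row in W]

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one example."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

# (a) Weight scale shifting: same function, different L2 penalty.
W1 = [[1.0, -2.0], [0.5, 3.0]]   # hypothetical weights
W2 = [[2.0, -1.0]]
x = [1.0, 0.5]
c = 10.0                          # any positive constant works for ReLU

W1_s, W2_s = scale(W1, c), scale(W2, 1.0 / c)
out_original = forward(W1, W2, x)
out_shifted = forward(W1_s, W2_s, x)
penalty_original = l2_penalty(W1, W2)
penalty_shifted = l2_penalty(W1_s, W2_s)

# (b) Logit scaling: the loss shrinks although the prediction is unchanged.
logits = [2.0, 1.0, 0.0]          # class 0 is the (correct) argmax
losses = [cross_entropy([s * z for z in logits], 0) for s in (1.0, 10.0, 100.0)]

print("outputs:", out_original, out_shifted)
print("L2 penalties:", penalty_original, penalty_shifted)
print("losses under logit scaling:", losses)
```

Because the rescaled network computes the same function, a penalty that differs between the two parameterizations cannot be measuring a property of the function itself, which is the motivation the abstract gives for a scale-shift-invariant regularizer.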

The performance of machine learning models under distribution shift has been a focus of the community in recent years. This thesis studies the distribution shift problem from the perspective of pre-training and data augmentation, two important factors in the practice of deep learning that have not been systematically investigated by existing work. Based on empirical results obtained from 1,330 models, we make the following main observations: 1) ERM combined with data augmentation can achieve state-of-the-art performance if we choose a proper pre-trained model that respects the properties of the data; 2) specialized algorithms further improve robustness on top of ERM when handling a specific type of distribution shift, e.g., GroupDRO for spurious correlation and CORAL for large-scale out-of-distribution data; 3) comparing different pre-training modes, architectures and data sizes, we provide novel observations about pre-training under distribution shift, which shed light on designing or selecting pre-training strategies for different kinds of distribution shift. Next, we investigate how to improve fine-tuning on small data sets. Our theoretical analysis shows that the excess risk bound on a target task can be improved when appropriate pre-training data is included in fine-tuning. Motivated by this analysis, we propose a novel strategy that selects a subset of the pre-training data to help improve generalization on the target task, especially when the downstream data is scarce or has a long-tail label distribution. Extensive experimental results for image classification tasks on 8 benchmark data sets verify the effectiveness of the proposed data-selection-based fine-tuning pipeline.

When training deep neural networks with batch normalization (BN-DNNs), weight decay (WD) is often used to ensure good generalization, even though some convolution layers are invariant to weight rescaling due to the normalization. We demonstrate that the gradient norm during training with WD is quite unstable, leading to an overfitting effect, and that the performance of WD is very sensitive to its hyperparameter. To address these weaknesses, we propose to regularize the weight norm using a simple yet effective weight rescaling (WRS) scheme as an alternative to weight decay. WRS controls the weight norm by explicitly rescaling it to unit norm, which prevents a large increase in the gradient while still ensuring a sufficiently large effective learning rate to improve generalization. On a variety of computer vision applications, including image classification, object detection, semantic segmentation and crowd counting, we show the effectiveness and robustness of WRS compared with weight decay, implicit weight rescaling and gradient projection. When training deep neural networks for regression, the training is sensitive to initialization and the learning curve fluctuates. To address these two issues, we propose Adaptive Momentum Weight Averaging (AMWA) to smooth the loss surface and stabilize the training process of DNNs. We show that the proposed method decreases the variance during training and improves robustness to initialization on a wide variety of architectures and regression tasks.
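The invariance that makes explicit rescaling possible can be sketched in a few lines. This is not the thesis's WRS procedure (the abstract does not specify when or to which layers rescaling is applied); it only illustrates, under stated assumptions, that rescaling a weight vector to unit norm leaves the batch-normalized output unchanged. The assumptions are a single hypothetical linear unit, normalization without affine parameters, and `eps = 0` (with the small positive `eps` used in practice the invariance holds only approximately).

```python
import math

def pre_activations(w, batch):
    """Linear unit applied to each example in the batch."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for x in batch]

def batch_normalize(values, eps=0.0):
    """Normalize to zero mean and unit variance over the batch (no affine part)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def rescale_to_unit_norm(w):
    """Explicitly rescale a weight vector so that its L2 norm equals 1."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]

# Hypothetical data and weights, chosen only for illustration.
batch = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5], [-2.0, 1.0]]
w = [4.0, -3.0]                      # L2 norm is 5

w_unit = rescale_to_unit_norm(w)
out_before = batch_normalize(pre_activations(w, batch))
out_after = batch_normalize(pre_activations(w_unit, batch))

print("norm after rescaling:", math.sqrt(sum(wi * wi for wi in w_unit)))
print("normalized outputs before:", out_before)
print("normalized outputs after: ", out_after)
```

Since the normalized output depends only on the direction of the weight vector in this setting, the norm can be fixed directly rather than controlled indirectly through a decay coefficient.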

    Research areas

  • Adversarial Vulnerability, Distribution Shift, Training Stability