An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation
Research output: Conference Papers › RGC 32 - Refereed conference paper (without host publication) › peer-review
Author(s)
Liu, Ziquan; Xu, Yi; Xu, Yuanhong et al.
Detail(s)
| Original language | English |
| --- | --- |
| Publication status | Published - Nov 2022 |
Conference
| Title | 36th Conference on Neural Information Processing Systems (NeurIPS 2022) |
| --- | --- |
| Location | Hybrid, New Orleans Convention Center |
| Place | United States |
| City | New Orleans |
| Period | 28 November - 9 December 2022 |
Link(s)
| Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(27d14062-aea3-4b0d-aafc-5c52d26b13a8).html |
| --- | --- |
Abstract
The performance of machine learning models under distribution shift has been a focus of the community in recent years. Most current methods improve robustness to distribution shift from the algorithmic perspective, i.e., by designing better training algorithms that help generalization to shifted test distributions. This paper studies the distribution shift problem from the perspective of pre-training and data augmentation, two important factors in the practice of deep learning that have not been systematically investigated by existing work. By evaluating seven pre-trained models, including ResNets [1] and ViTs [2] in both self-supervised and supervised pre-training modes, on five important distribution-shift datasets from the WILDS [3] and DomainBed [4] benchmarks, with five different learning algorithms, we provide the first comprehensive empirical study focusing on pre-training and data augmentation. From the empirical results obtained over 1,330 models, we make the following main observations: 1) ERM combined with data augmentation can achieve state-of-the-art performance if we choose a pre-trained model that respects the data properties; 2) specialized algorithms further improve robustness on top of ERM when handling a specific type of distribution shift, e.g., GroupDRO [5] for spurious correlations and CORAL [6] for large-scale out-of-distribution data; 3) comparing different pre-training modes, architectures, and data sizes, we provide novel observations about pre-training under distribution shift, which shed light on designing or selecting a pre-training strategy for different kinds of distribution shift. In summary, our empirical study provides a comprehensive baseline for a wide range of pre-trained models fine-tuned with data augmentation, which may inspire future research that exploits the power of pre-training and data augmentation in the study of distribution shift.
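To make observation 1 above concrete, here is a minimal PyTorch sketch of ERM fine-tuning of a supervised ImageNet pre-trained backbone with standard data augmentation. The backbone choice (ResNet-50), dataset path, class count, and hyperparameters are illustrative assumptions, not the paper's exact experimental setup.

```python
# Minimal sketch: ERM fine-tuning of a pre-trained model with data augmentation.
# The dataset path, 10-class head, and optimizer settings are hypothetical.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# A common augmentation pipeline (random crop + horizontal flip); the paper
# compares several augmentation strategies, this is just one standard choice.
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical ImageFolder-style data; WILDS/DomainBed ship their own loaders.
train_set = datasets.ImageFolder("path/to/train", transform=train_tf)
loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

# Supervised ImageNet pre-trained ResNet-50; a ViT or a self-supervised
# checkpoint could be swapped in to compare pre-training modes.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 10)  # new task head (10 classes, illustrative)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()  # ERM: minimize the average training loss

model.train()
for x, y in loader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```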
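For observation 2, the sketch below illustrates the CORAL penalty in the spirit of Sun and Saenko's Deep CORAL: it aligns second-order feature statistics across two training domains and is added to the ERM objective with a trade-off weight. The feature dimension, batch size, and the random inputs in the usage example are assumptions for illustration only.

```python
# Minimal sketch of a CORAL-style penalty: squared Frobenius distance
# between the feature covariance matrices of two domains.
import torch

def coral_loss(source_feats: torch.Tensor, target_feats: torch.Tensor) -> torch.Tensor:
    """CORAL penalty between two batches of features of shape (n, d)."""
    d = source_feats.size(1)

    def covariance(x: torch.Tensor) -> torch.Tensor:
        x = x - x.mean(dim=0, keepdim=True)   # center the features
        return (x.t() @ x) / (x.size(0) - 1)  # unbiased covariance estimate

    c_s, c_t = covariance(source_feats), covariance(target_feats)
    return ((c_s - c_t) ** 2).sum() / (4 * d * d)

# Usage: add the penalty to the task loss with a trade-off weight.
# `feats_a` and `feats_b` stand in for backbone features from two domains.
feats_a, feats_b = torch.randn(32, 512), torch.randn(32, 512)
penalty = coral_loss(feats_a, feats_b)  # scalar added to the ERM loss
```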
Citation Format(s)
An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation. / Liu, Ziquan; Xu, Yi; Xu, Yuanhong et al.
2022. Paper presented at 36th Conference on Neural Information Processing Systems (NeurIPS 2022), New Orleans, Louisiana, United States.
Research output: Conference Papers › RGC 32 - Refereed conference paper (without host publication) › peer-review