Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Author(s)
Chen, Shengzhuang; Tack, Jihoon; Yang, Yunqiao et al.
Detail(s)
| Original language | English |
| --- | --- |
| Title of host publication | Proceedings of the 41st International Conference on Machine Learning |
| Pages | 7280-7297 |
| Publication status | Published - Jul 2024 |
Publication series
| Name | Proceedings of Machine Learning Research |
| --- | --- |
| Volume | 235 |
| ISSN (Print) | 2640-3498 |
Conference
| Title | 41st International Conference on Machine Learning (ICML 2024) |
| --- | --- |
| Location | Messe Wien Exhibition Congress Center |
| City | Vienna |
| Country | Austria |
| Period | 21 - 27 July 2024 |
Abstract
Recent successes suggest that parameter-efficient fine-tuning of foundation models is becoming the state-of-the-art method for transfer learning in vision, gradually replacing the rich literature of alternatives such as meta-learning. In trying to harness the best of both worlds, meta-tuning introduces a subsequent optimization stage of foundation models but has so far only shown limited success and crucially tends to underperform on out-of-distribution (OOD) tasks. In this paper, we introduce Sparse MetA-Tuning (SMAT), a method inspired by sparse mixture-of-experts approaches and trained to isolate subsets of pre-trained parameters automatically for meta-tuning on each task. SMAT successfully overcomes OOD sensitivity and delivers on the promise of enhancing the transfer abilities of vision foundation models beyond parameter-efficient fine-tuning. We establish new state-of-the-art results on a challenging combination of Meta-Dataset augmented with additional OOD tasks in both zero-shot and gradient-based adaptation settings. In addition, we provide a thorough analysis of the superiority of learned over hand-designed sparsity patterns for sparse expert methods and the pivotal importance of the sparsity level in balancing between in-distribution and out-of-distribution generalization. Our code and models are publicly available. Copyright 2024 by the author(s).
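To make the mechanism in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of the general idea: each expert holds a learned sparse mask over a frozen pre-trained weight matrix, a gate routes each task to experts, and the merged mask interpolates between pre-trained and meta-tuned parameters. This is not the authors' implementation (their code is released separately); class and argument names such as `SparseInterpolatedExperts`, `num_experts`, and `sparsity` are illustrative assumptions, and gradient handling for the hard masks (e.g. straight-through estimation) is omitted for brevity.

```python
# Hypothetical sketch of sparse interpolated experts; not the SMAT code.
import torch
import torch.nn as nn


class SparseInterpolatedExperts(nn.Module):
    """Each expert owns a sparse mask selecting which pre-trained weights are
    replaced by meta-tuned values; unmasked weights keep their pre-trained values."""

    def __init__(self, in_dim, out_dim, num_experts=4, sparsity=0.9):
        super().__init__()
        self.register_buffer("w_pre", torch.randn(out_dim, in_dim))       # frozen pre-trained weights
        self.w_tuned = nn.Parameter(self.w_pre.clone())                   # meta-tuned copy of the weights
        self.mask_logits = nn.Parameter(0.01 * torch.randn(num_experts, out_dim, in_dim))
        self.gate = nn.Linear(in_dim, num_experts)                        # routes each task to experts
        self.sparsity = sparsity

    def expert_masks(self):
        # Hard top-k masks: keep the (1 - sparsity) fraction of entries with the
        # largest logits per expert (straight-through gradients omitted here).
        k = max(1, int((1 - self.sparsity) * self.mask_logits[0].numel()))
        flat = self.mask_logits.flatten(1)
        thresh = flat.topk(k, dim=1).values[:, -1:]
        return (flat >= thresh).float().view_as(self.mask_logits)

    def forward(self, x):
        # x: (num_examples, in_dim) for one task; routing weights are pooled over the task.
        gate = torch.softmax(self.gate(x).mean(dim=0), dim=-1)            # (num_experts,)
        mask = (gate[:, None, None] * self.expert_masks()).sum(dim=0)     # merged sparse mask in [0, 1]
        w = mask * self.w_tuned + (1.0 - mask) * self.w_pre               # sparse interpolation
        return x @ w.t()


if __name__ == "__main__":
    layer = SparseInterpolatedExperts(in_dim=16, out_dim=8)
    support = torch.randn(5, 16)        # a toy few-shot task
    print(layer(support).shape)         # torch.Size([5, 8])
```

In this reading, the sparsity level controls how many pre-trained parameters each expert may overwrite, which is the knob the abstract identifies as balancing in-distribution against out-of-distribution generalization.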
Bibliographic Note
Research Unit(s) information for this publication is provided by the author(s) concerned.
Citation Format(s)
Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts. / Chen, Shengzhuang; Tack, Jihoon; Yang, Yunqiao et al.
Proceedings of the 41st International Conference on Machine Learning. 2024. p. 7280-7297 (Proceedings of Machine Learning Research; Vol. 235).
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review