DistillCSE: Distilled Contrastive Learning for Sentence Embeddings

Jiahao Xu, Wei Shao, Lihui Chen, Lemao Liu

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-reviewed

6 Citations (Scopus)
46 Downloads (CityUHK Scholars)

Abstract

This paper proposes the DistillCSE framework, which performs contrastive learning under the self-training paradigm with knowledge distillation. The potential advantage of DistillCSE is its self-enhancing feature: using a base model to provide additional supervision signals, a stronger model may be learned through knowledge distillation. However, a vanilla DistillCSE implemented with standard knowledge distillation achieves only marginal improvements due to severe overfitting. Further quantitative analyses reveal the reason: owing to the nature of contrastive learning, the teacher model's logits exhibit a relatively large variance. To mitigate the issue induced by this high variance, the paper proposes two simple yet effective solutions for knowledge distillation: a Group-P shuffling strategy that acts as an implicit regularization, and averaging the logits from multiple teacher components. Experiments on standard benchmarks demonstrate that the proposed DistillCSE outperforms many strong baseline methods and achieves new state-of-the-art performance.
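The abstract gives no implementation details; as a rough illustration of the multi-teacher logit-averaging idea it describes, a minimal PyTorch sketch might look like the following, assuming SimCSE-style in-batch similarity logits (all function names and parameters here are illustrative, not the paper's code):

```python
# Illustrative sketch only: averaging contrastive logits from several
# teacher encoders to reduce variance before distilling into a student.
import torch
import torch.nn.functional as F

def contrastive_logits(emb: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """In-batch cosine-similarity logits, SimCSE-style (assumed setup)."""
    z = F.normalize(emb, dim=-1)
    return (z @ z.T) / temperature  # shape: (batch, batch)

def distill_loss(student_emb: torch.Tensor,
                 teacher_embs: list[torch.Tensor],
                 temperature: float = 0.05) -> torch.Tensor:
    """KL divergence between the student's logits and logits averaged over
    multiple teacher components, the variance-reduction idea the abstract
    mentions."""
    with torch.no_grad():  # teachers provide fixed supervision signals
        teacher_logits = torch.stack(
            [contrastive_logits(t, temperature) for t in teacher_embs]
        ).mean(dim=0)
    student_logits = contrastive_logits(student_emb, temperature)
    return F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(teacher_logits, dim=-1),
                    reduction="batchmean")
```

Averaging over several teachers smooths out the per-teacher noise in the similarity logits, which is consistent with the high-variance problem the abstract identifies, though the paper's actual formulation should be consulted for specifics.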
Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: EMNLP 2023
Publisher: Association for Computational Linguistics
Pages: 8153-8165
Number of pages: 13
ISBN (Electronic): 9798891760615
DOIs:
Publication status: Published - Dec 2023
Event: 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023) - Resorts World Convention Centre (Hybrid), Singapore
Duration: 6 Dec 2023 - 10 Dec 2023
https://aclanthology.org/2023.emnlp-main
https://2023.emnlp.org/

Conference

Conference: 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)
Abbreviated title: EMNLP
Place: Singapore
Period: 6/12/23 - 10/12/23

Bibliographical note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
