Abstract
A common approach is to pre-train a language model on a large corpus and then fine-tune it on task-specific data. In practice, we observe that fine-tuning a pre-trained model on a small dataset may lead to over- and/or under-estimation problems. In this paper, we propose MC-Tailor, a novel method that alleviates this issue in text generation tasks by truncating and transferring probability mass from over-estimated regions to under-estimated ones. Experiments on a variety of text generation datasets show that MC-Tailor consistently and significantly outperforms the fine-tuning approach. Our code is available at https://github.com/NingMiao/MC-tailor.
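The exact algorithm is given in the paper and the linked repository; purely as a rough sketch of the core idea (not the authors' implementation), capped rejection sampling illustrates how probability mass can be moved: samples from over-estimated regions are rejected more often, which renormalizes mass toward under-estimated ones. The names `sample_fn`, `ratio_fn`, and `cap` below are hypothetical, with `ratio_fn` standing in for a learned estimate of p_true(x) / p_model(x).

```python
import random

def tailored_sample(sample_fn, ratio_fn, cap=5.0):
    """Draw one sample from a tailored distribution via rejection sampling.

    sample_fn: draws a sample x from the fine-tuned model (hypothetical).
    ratio_fn:  estimates r(x) ~ p_true(x) / p_model(x) (hypothetical; in the
               paper this role is played by a learned ratio model).
    cap:       upper bound on r(x); mass above the cap is truncated, and
               renormalization shifts it toward under-estimated regions.
    """
    while True:
        x = sample_fn()
        # Accept x with probability min(r(x), cap) / cap.
        if random.random() < min(ratio_fn(x), cap) / cap:
            return x

# Toy usage: base samples are Uniform{0..9}; suppose the model over-estimates
# x < 5 (ratio 0.5) and under-estimates x >= 5 (ratio 2.0).
draws = [tailored_sample(lambda: random.randrange(10),
                         lambda x: 2.0 if x >= 5 else 0.5)
         for _ in range(10000)]
print(sum(d >= 5 for d in draws) / len(draws))  # ~0.8: mass shifted upward
```

In the toy run, acceptance probabilities are 0.4 for x >= 5 and 0.1 for x < 5, so roughly 80% of accepted samples fall in the previously under-estimated region.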
Original language | English |
---|---|
Title of host publication | ACL 2020 - The 58th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference |
Editors | Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault |
Place of Publication | Stroudsburg, PA |
Publisher | Association for Computational Linguistics |
Pages | 3436-3441 |
ISBN (Print) | 978-1-952148-25-5 |
DOIs | |
Publication status | Published - Jul 2020 |
Externally published | Yes |
Event | 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020) - Virtual, United States. Duration: 5 Jul 2020 → 10 Jul 2020. https://acl2020.org/ |
Publication series
Name | Proceedings of the Annual Meeting of the Association for Computational Linguistics |
---|---|
ISSN (Print) | 0736-587X |
Conference
Conference | 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020) |
---|---|
Abbreviated title | ACL2020 |
Country/Territory | United States |
Period | 5/07/20 → 10/07/20 |
Internet address | https://acl2020.org/ |