Recovery Should Never Deviate from Ground Truth: Mitigating Exposure Bias in Neural Machine Translation

Jianfei He, Shichao Sun, Xiaohua Jia, Wenjie Li

Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

1 Citation (Scopus)
6 Downloads (CityUHK Scholars)

Abstract

In Neural Machine Translation, models are often trained with teacher forcing and suffer from exposure bias due to the discrepancy between training and inference. Current token-level solutions, such as scheduled sampling, aim to maximize the model's capability to recover from errors. Their loss functions have a side effect: a sequence with errors may have a larger probability than the ground truth. The consequence is that the generated sequences may deviate from the ground truth. This side effect is verified in our experiments. To address this issue, we propose using token-level contrastive learning to coordinate three training objectives: the usual MLE objective, an objective for recovery from errors, and a new objective to explicitly constrain the recovery in a scope that does not impact the ground truth. Our empirical analysis shows that this method effectively achieves these objectives in training and reduces the frequency with which the third objective is violated. Experiments on three language pairs (German-English, Russian-English, and English-Russian) show that our method outperforms the vanilla Transformer and other methods addressing the exposure bias. © 2024 The authors, © 2024 European Association for Machine Translation.
Original language: English
Title of host publication: Proceedings of the 25th Annual Conference of the European Association for Machine Translation
Editors: Carolina Scarton, Charlotte Prescott, Chris Bayliss, Chris Oakley, Joanna Wright, Stuart Wrigley, Xingyi Song, Edward Gow-Smith, Rachel Bawden, Víctor M Sánchez-Cartagena, Patrick Cadwell, Ekaterina Lapshinova-Koltunski, Vera Cabarrão, Konstantinos Chatzitheodorou, Mary Nurminen, Diptesh Kanojia, Helena Moniz
Publisher: European Association for Machine Translation
Pages: 68-79
Volume: 1: Research And Implementations & Case Studies
ISBN (Print): 978-1-0686907-0-9
Publication status: Published - Jun 2024
Event: 25th Annual Conference of the European Association for Machine Translation (EAMT 2024) - University of Sheffield, Sheffield, United Kingdom
Duration: 24 Jun 2024 → 27 Jun 2024
https://eamt2024.sheffield.ac.uk/

Publication series

Name: Proceedings of the Annual Conference of the European Association for Machine Translation, EAMT

Conference

Conference: 25th Annual Conference of the European Association for Machine Translation (EAMT 2024)
Country/Territory: United Kingdom
City: Sheffield
Period: 24/06/24 → 27/06/24

Bibliographical note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Publisher's Copyright Statement

  • This full text is made available under CC-BY-ND 4.0. https://creativecommons.org/licenses/by-nd/4.0/
