Abstract
Large language models (LLMs) demonstrate significant capabilities in many natural language processing tasks. However, their machine translation performance still lags behind that of encoder-decoder models trained specifically for translation. This paper investigates how to improve neural machine translation (NMT) with LLMs. Our proposal rests on the empirical observation that NMT output is less fluent than human translation. We propose to use LLMs to enhance the fluency of NMT output by integrating a language model on the target side. We use contrastive learning to constrain fluency so that it does not exceed the LLMs' fluency. Our experiments on three language pairs show that this method improves NMT performance, and our empirical analysis further shows that it improves fluency on the target side. Our experiments also show that some straightforward post-processing methods using LLMs, such as re-ranking and refinement, are not effective. © 2025 The authors.
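For readers unfamiliar with the re-ranking baseline named in the abstract, the following is a minimal sketch of generic n-best re-ranking with a language-model score. It is not the authors' implementation: the `Candidate` class, the `rerank` function, the `lm_weight` interpolation parameter, and the toy scores are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Candidate:
    text: str
    nmt_logprob: float  # hypothetical log-probability from an NMT model


def rerank(
    candidates: Sequence[Candidate],
    lm_logprob: Callable[[str], float],
    lm_weight: float = 0.5,
) -> Candidate:
    """Return the candidate with the highest interpolated score.

    score = (1 - lm_weight) * nmt_logprob + lm_weight * lm_logprob(text)
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(
        candidates,
        key=lambda c: (1.0 - lm_weight) * c.nmt_logprob
        + lm_weight * lm_logprob(c.text),
    )


# Toy stand-in for an LLM fluency scorer (assumed values, not real model output).
toy_lm_scores = {
    "he go to school": -6.0,
    "he goes to school": -3.0,
}

n_best = [
    Candidate("he go to school", nmt_logprob=-1.0),
    Candidate("he goes to school", nmt_logprob=-1.2),
]

best = rerank(n_best, toy_lm_scores.__getitem__, lm_weight=0.5)
print(best.text)  # he goes to school
```

With `lm_weight=0.0` the function falls back to the NMT model's own top candidate, so the weight controls how much the language-model score is allowed to override the translation model.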
| Original language | English |
|---|---|
| Title of host publication | Proceedings of Machine Translation Summit XX |
| Subtitle of host publication | Volume 1 |
| Editors | Pierrette Bouillon, Johanna Gerlach, Sabrina Girletti, Lise Volkart, Raphael Rubino, Rico Sennrich, Ana C. Farinha, Marco Gaido, Joke Daems, Dorothy Kenny, Helena Moniz, Sara Szoc |
| Place of Publication | Switzerland |
| Publisher | European Association for Machine Translation |
| Pages | 54-64 |
| ISBN (Print) | 978-2-9701897-0-1 |
| Publication status | Published - Jun 2025 |
| Event | 20th Machine Translation Summit, MTSummit 2025, Uni Mail, Geneva, Switzerland. Duration: 23 Jun 2025 → 27 Jun 2025. https://mtsummit2025.unige.ch/index.html |
Publication series
| Name | MT Summit - Proceedings of Machine Translation Summit |
|---|---|
Conference
| Conference | 20th Machine Translation Summit, MTSummit 2025 |
|---|---|
| Abbreviated title | MTSummit 2025 |
| Place | Switzerland |
| City | Geneva |
| Period | 23/06/25 → 27/06/25 |
| Internet address | https://mtsummit2025.unige.ch/index.html |
Bibliographical note
Research Unit(s) information for this publication is provided by the author(s) concerned.
Publisher's Copyright Statement
- This full text is made available under CC-BY-ND 4.0. https://creativecommons.org/licenses/by-nd/4.0/
Title
Improve Fluency Of Neural Machine Translation Using Large Language Models