A Thorough Examination of Decoding Methods in the Era of LLMs

Chufan Shi (Co-first Author), Haoran Yang (Co-first Author), Deng Cai*, Zhisong Zhang, Yifan Wang, Yujiu Yang*, Wai Lam

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

6 Citations (Scopus)
4 Downloads (CityUHK Scholars)

Abstract

Decoding methods play an indispensable role in converting language models from next-token predictors into practical task solvers. Prior research on decoding methods, primarily focusing on task-specific models, may not extend to the current era of general-purpose large language models (LLMs). Moreover, the recent influx of decoding strategies has further complicated this landscape. This paper provides a comprehensive and multifaceted analysis of various decoding methods within the context of LLMs, evaluating their performance, robustness to hyperparameter changes, and decoding speeds across a wide range of tasks, models, and deployment environments. Our findings reveal that decoding method performance is notably task-dependent and influenced by factors such as alignment, model size, and quantization. Intriguingly, sensitivity analysis exposes that certain methods achieve superior performance at the cost of extensive hyperparameter tuning, highlighting the trade-off between attaining optimal results and the practicality of implementation in varying contexts. © 2024 Association for Computational Linguistics.
Original language: English
Title of host publication: EMNLP 2024 - The 2024 Conference on Empirical Methods in Natural Language Processing
Subtitle of host publication: Proceedings of the Conference
Publisher: ACL Anthology
Pages: 8601-8629
Number of pages: 29
ISBN (Print): 9798891761643
Publication status: Published - Nov 2024
Externally published: Yes
Event: 29th Conference on Empirical Methods in Natural Language Processing (EMNLP 2024) - Hybrid, Miami, United States
Duration: 12 Nov 2024 - 16 Nov 2024
https://2024.emnlp.org/

Publication series

Name: EMNLP - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference

Conference

Conference: 29th Conference on Empirical Methods in Natural Language Processing (EMNLP 2024)
Abbreviated title: EMNLP 2024
Place: United States
City: Miami
Period: 12/11/24 - 16/11/24

Funding

This research is partly supported by the Shenzhen Science and Technology Program (JCYJ20220818101014030) and the "Graph Neural Network Project" of Ping An Technology (Shenzhen) Co., Ltd. Additionally, the work described in this paper is substantially funded by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Code: 14200620).

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/

RGC Funding Information

  • RGC-funded
