Abstract
Current logical reasoning evaluations of Large Language Models (LLMs) primarily focus on single-turn and static environments, such as arithmetic problems. The crucial problem of multi-turn, strategic reasoning is under-explored. In this work, we analyze the multi-turn strategic reasoning of LLMs through text-driven complete- and incomplete-information gaming, e.g., board games (Tic-Tac-Toe, Connect-4) and poker games (Texas Hold’em Poker). Specifically, we consider two distinct scenarios: 1) Online Racing, featuring multiple LLMs/agents to facilitate direct competition and comparison; 2) Offline Probing, constructing targeted questions with verified ground truth to evaluate LLMs’ strategic behaviors. Experimental results demonstrate that existing state-of-the-art LLMs and reasoning schemes are largely ineffective for strategic reasoning tasks. To mitigate these limitations, we propose a simple yet effective Recursively Thinking-Ahead (ReTA) agent, incorporating a recursive prompting mechanism that automatically analyzes the opponents’ future moves/actions and assigns reward signals for these situations, to strengthen the strategic reasoning of LLMs. We hope our work could spur further research and exploration in the multi-turn strategic reasoning of LLMs. The code is available at https://github.com/jinhaoduan/ReTA.
© 2024 Association for Computational Linguistics.
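The abstract describes ReTA as recursively analyzing the opponent's future moves and assigning reward signals to the resulting situations. The sketch below illustrates only that general idea on Tic-Tac-Toe, and it is not the authors' implementation (their code is at the repository linked above). It swaps the LLM prompting for a plain depth-limited minimax search, and every name in it (`think_ahead`, `best_move`, the +1/0/-1 reward values, the default depth) is an assumption made for illustration.

```python
from typing import List, Optional, Tuple

Board = List[str]  # 9 cells, each "X", "O", or " " (empty)

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board: Board) -> List[int]:
    """Indices of the empty cells."""
    return [i for i, cell in enumerate(board) if cell == " "]


def think_ahead(board: Board, player: str, me: str, depth: int) -> float:
    """Reward of `board` from `me`'s point of view, with `player` to move.

    Hypothetical reward scheme: +1 if `me` has won, -1 if the opponent
    has won, 0 for a draw or when the look-ahead depth runs out.
    """
    w = winner(board)
    if w is not None:
        return 1.0 if w == me else -1.0
    moves = legal_moves(board)
    if not moves or depth == 0:
        return 0.0
    other = "O" if player == "X" else "X"
    rewards = []
    for m in moves:
        board[m] = player                      # try the move
        rewards.append(think_ahead(board, other, me, depth - 1))
        board[m] = " "                         # undo it
    # `me` picks the best outcome; the opponent is assumed to pick the worst.
    return max(rewards) if player == me else min(rewards)


def best_move(board: Board, me: str, depth: int = 4) -> Tuple[int, float]:
    """Return (move index, reward) for the highest-reward legal move."""
    other = "O" if me == "X" else "X"
    best: Tuple[int, float] = (-1, float("-inf"))
    for m in legal_moves(board):
        board[m] = me
        reward = think_ahead(board, other, me, depth - 1)
        board[m] = " "
        if reward > best[1]:
            best = (m, reward)
    return best
```

For example, calling `best_move(["X", "X", " ", "O", "O", " ", " ", " ", " "], "X")` selects cell 2, which completes the top row. In the paper's setting an LLM, rather than exhaustive enumeration, would propose and assess the future moves, so this sketch should be read as an analogy for the recursion and reward assignment only.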
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
| Publisher | Association for Computational Linguistics |
| Pages | 2232-2246 |
| Volume | 1 |
| ISBN (Print) | 9798891761148 |
| Publication status | Published - Jun 2024 |
| Externally published | Yes |
| Event | 2024 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024), Hybrid, Mexico City, Mexico. Duration: 16 Jun 2024 → 21 Jun 2024. https://aclanthology.org/2024.naacl-long |
Publication series
| Name | Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL |
|---|---|
Conference
| Conference | 2024 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024) |
|---|---|
| Place | Mexico |
| City | Mexico City |
| Period | 16/06/24 → 21/06/24 |
Funding
This work was performed under the auspices of the U.S. Department of Energy by the Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344 and was supported by the LLNL LDRD Program under Project No. 23-ERD-030. This work was partially supported by NSF No. 2319242.
Publisher's Copyright Statement
- This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
Fingerprint
Dive into the research topics of 'ReTA: Recursively Thinking Ahead to Improve the Strategic Reasoning of Large Language Models'. Together they form a unique fingerprint.