Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification

Aojun Zhou, Ke Wang, Zimu Lu, Weikang Shi, Sichun Luo, Zipeng Qin, Shaoqing Lu, Anya Jia, Linqi Song, Mingjie Zhan*, Hongsheng Li*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

22 Citations (Scopus)

Abstract

Recent progress in large language models (LLMs) such as GPT-4 and PaLM-2 has brought significant advancements in solving math problems. In particular, OpenAI's latest version of GPT-4, known as GPT-4 Code Interpreter, shows remarkable performance on challenging math datasets. In this paper, we explore the effect of code on enhancing LLMs' reasoning capability by introducing different constraints on the Code Usage Frequency of GPT-4 Code Interpreter. We find that its success is largely attributable to its ability to generate and execute code, evaluate the execution results, and rectify its solution when it receives unreasonable outputs. Based on this, we propose a novel prompting method, explicit code-based self-verification (CSV). This method employs a zero-shot prompt on GPT-4 Code Interpreter to encourage it to use code to self-verify its answers. When the verification state is "False", the model automatically amends its solution. Furthermore, we recognize that the verification states indicate the confidence of a solution, which can be used to improve the effectiveness of majority voting. With GPT-4 Code Interpreter and CSV, we achieve impressive zero-shot accuracy on various mathematical problem-solving benchmarks.

© 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved.
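The idea in the abstract of using verification states to inform majority voting can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the state labels other than "False", and the numeric weights are all assumptions chosen for illustration. Each sampled solution casts a vote for its final answer, and the vote is scaled by a weight that depends on that solution's verification state.

```python
from collections import defaultdict

# Hypothetical weights per verification state. The abstract names only the
# "False" state; the other labels and all numeric values are assumptions.
STATE_WEIGHTS = {"True": 1.0, "Uncertain": 0.5, "False": 0.2}


def weighted_majority_vote(samples, weights=STATE_WEIGHTS):
    """Pick the answer with the largest total verification-weighted vote.

    samples: list of (answer, verification_state) pairs, one per sampled solution.
    """
    scores = defaultdict(float)
    for answer, state in samples:
        # A solution whose self-verification failed contributes less.
        scores[answer] += weights[state]
    return max(scores, key=scores.get)


# Three unverified solutions agree on "42"; one verified solution says "7".
samples = [("42", "False"), ("42", "False"), ("42", "False"), ("7", "True")]
print(weighted_majority_vote(samples))
```

With uniform weights this reduces to plain majority voting, which would pick "42" in the example; with the assumed weights, the single verified answer outweighs the three unverified ones.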
Original language: English
Title of host publication: The Twelfth International Conference on Learning Representations, ICLR 2024
Publisher: International Conference on Learning Representations, ICLR
Number of pages: 27
Publication status: Published - May 2024
Event: 12th International Conference on Learning Representations (ICLR 2024) - Messe Wien Exhibition and Congress Center, Vienna, Austria
Duration: 7 May 2024 → 11 May 2024
https://iclr.cc/Conferences/2024
https://openreview.net/group?id=ICLR.cc/2024/Conference

Publication series

Name: International Conference on Learning Representations, ICLR

Conference

Conference: 12th International Conference on Learning Representations (ICLR 2024)
Place: Austria
City: Vienna
Period: 7/05/24 → 11/05/24

Bibliographical note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Funding

This project is funded in part by the National Key R&D Program of China (Project 2022ZD0161100) and in part by the General Research Fund of Hong Kong RGC (Project 14204021).

RGC Funding Information

  • RGC-funded
