Aligning Crowd-sourced Human Feedback for Code Generation with Bayesian Inference

Man Fai Wong*, Chee Wei Tan

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

1 Citation (Scopus)

Abstract

While large language models (LLMs) excel at code generation, translating abstract descriptions into robust, functional code remains a significant challenge. Existing approaches to refining LLM code generation are limited either by static rules or by the computational overhead of additional training, and they fall short of the demands of real-world code quality. This paper proposes cRLHF, a method that improves LLM code generation by combining reinforcement learning from human feedback (RLHF) with crowd-sourced computation. Our goal is to enhance code quality through diverse end-user feedback. Traditional RLHF, which relies on a single evaluator, risks bias and can overlook insights, hampering the improvement of LLMs. The cRLHF framework uses Bayesian inference to obtain an objective code evaluation from multiple evaluators. Our experiments show significant improvements in code correctness, demonstrating the efficacy of combining crowd-sourcing with reinforcement learning. © 2024 IEEE.
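
The abstract does not spell out the inference model, so the following is only a minimal sketch of how feedback from several evaluators might be combined with Bayes' rule. It assumes binary correct/incorrect votes, independent evaluators, and a known per-evaluator accuracy strictly between 0 and 1. The function name `posterior_correct` and its parameters are hypothetical and are not taken from the paper.

```python
from math import prod


def posterior_correct(votes, accuracies, prior=0.5):
    """Posterior probability that a code sample is correct.

    Assumptions (illustrative, not from the paper):
    - votes[i] is 1 if evaluator i judged the code correct, else 0
    - accuracies[i] is the probability that evaluator i judges correctly
    - evaluators are conditionally independent given the true label
    """
    if len(votes) != len(accuracies):
        raise ValueError("votes and accuracies must have the same length")
    if not all(0.0 < a < 1.0 for a in accuracies):
        raise ValueError("each accuracy must lie strictly between 0 and 1")
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")

    # Likelihood of the observed votes if the code is truly correct.
    like_correct = prod(a if v else 1.0 - a for v, a in zip(votes, accuracies))
    # Likelihood of the observed votes if the code is truly incorrect.
    like_incorrect = prod(1.0 - a if v else a for v, a in zip(votes, accuracies))

    # Bayes' rule: normalize over both hypotheses.
    evidence = prior * like_correct + (1.0 - prior) * like_incorrect
    return prior * like_correct / evidence


if __name__ == "__main__":
    # Three evaluators of equal assumed accuracy; two approve, one rejects.
    print(posterior_correct([1, 1, 0], [0.8, 0.8, 0.8]))
```

Under these assumptions a dissenting vote lowers the posterior without vetoing the sample, which is one way a crowd-based score could be less sensitive to any single evaluator than single-evaluator feedback. Whether cRLHF uses such a score as its reward signal is not stated in the abstract.
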
Original language: English
Title of host publication: Proceedings - 2024 IEEE Conference on Artificial Intelligence
Subtitle of host publication: CAI 2024
Publisher: IEEE
Pages: 158-163
ISBN (Electronic): 979-8-3503-5409-6
ISBN (Print): 979-8-3503-5410-2
Publication status: Published - 2024
Event: 2nd IEEE Conference on Artificial Intelligence (IEEE CAI 2024) - Marina Bay Sands, Singapore
Duration: 25 Jun 2024 - 27 Jun 2024
https://ieeecai.org/2024/

Publication series

Name: Proceedings - IEEE Conference on Artificial Intelligence, CAI

Conference

Conference: 2nd IEEE Conference on Artificial Intelligence (IEEE CAI 2024)
Abbreviated title: CAI 2024
Place: Singapore
Period: 25/06/24 - 27/06/24
Bibliographical note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Research Keywords

  • AI alignment
  • Bayesian analysis
  • Code generation
  • Inductive bias
  • Reinforcement learning
