Abstract
The task of repository-level code completion is to continue writing unfinished code based on the broader context of the repository. For automated code completion tools, however, it is difficult to utilize the useful information scattered across different files. We propose RepoCoder, a simple, generic, and effective framework to address this challenge. It streamlines the repository-level code completion process by incorporating a similarity-based retriever and a pre-trained code language model in an iterative retrieval-generation pipeline. RepoCoder makes effective use of repository-level information for code completion and can generate code at various levels of granularity. Moreover, we propose a new benchmark, RepoEval, which consists of the latest high-quality real-world repositories and covers line, API invocation, and function body completion scenarios. Experimental results indicate that RepoCoder significantly improves the In-File completion baseline by over 10% in all settings and consistently outperforms the vanilla retrieval-augmented code completion approach. Furthermore, we validate the effectiveness of RepoCoder through comprehensive analysis, providing valuable insights for future research. ©2023 Association for Computational Linguistics.
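The iterative retrieval-generation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the prompt layout, or how the query is updated between rounds. The sketch assumes Jaccard similarity over identifier tokens, a prompt that simply prepends retrieved snippets, and a query that is extended with the previous round's completion. The names `retrieve`, `iterative_complete`, and the `generate` callable (a stand-in for a pre-trained code language model) are hypothetical.

```python
import re
from typing import Callable, List, Set


def identifiers(text: str) -> Set[str]:
    """Split code into a set of identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_]\w*", text))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the identifier sets (assumed metric)."""
    ta, tb = identifiers(a), identifiers(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(query: str, snippets: List[str], top_k: int = 2) -> List[str]:
    """Return the top_k repository snippets most similar to the query."""
    ranked = sorted(snippets, key=lambda s: jaccard(query, s), reverse=True)
    return ranked[:top_k]


def iterative_complete(
    unfinished: str,
    snippets: List[str],
    generate: Callable[[str], str],
    iterations: int = 2,
    top_k: int = 2,
) -> str:
    """Alternate retrieval and generation for a fixed number of rounds.

    Assumption: after each round, the model's completion is appended to
    the unfinished code to form the next retrieval query.
    """
    query = unfinished
    completion = ""
    for _ in range(iterations):
        context = retrieve(query, snippets, top_k)
        prompt = "\n".join(context) + "\n" + unfinished
        completion = generate(prompt)
        query = unfinished + completion
    return completion
```

In this sketch the first round retrieves with the unfinished code alone, while later rounds can surface snippets that only become relevant once a draft completion names the functions it intends to call.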
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing |
| Editors | Houda Bouamor, Juan Pino, Kalika Bali |
| Publisher | Association for Computational Linguistics |
| Pages | 2471–2484 |
| ISBN (Print) | 9798891760608 |
| Publication status | Published - Dec 2023 |
| Event | 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Resorts World Convention Centre (Hybrid), Singapore. Duration: 6 Dec 2023 → 10 Dec 2023. Links: https://aclanthology.org/2023.emnlp-main and https://2023.emnlp.org/ |
Publication series
| Name | EMNLP - Conference on Empirical Methods in Natural Language Processing, Proceedings |
|---|---|
Conference
| Conference | 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023) |
|---|---|
| Abbreviated title | EMNLP |
| Place | Singapore |
| Period | 6 Dec 2023 → 10 Dec 2023 |
Bibliographical note
Research Unit(s) information for this publication is provided by the author(s) concerned.
Publisher's Copyright Statement
- This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/