Large language model ChatGPT versus small deep learning models for self-admitted technical debt detection: Why not together?

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Author(s)

  • Jun Li
  • Lixian Li
  • Jin Liu
  • Xiao Yu
  • Xiao Liu

Detail(s)

Original language: English
Journal / Publication: Software - Practice and Experience
Online published: 28 Jun 2024
Publication status: Online published - 28 Jun 2024

Abstract

Given the increasing complexity and volume of Self-Admitted Technical Debts (SATDs), detecting them efficiently has become critical in software engineering practice for improving code quality and project efficiency. Although current deep learning methods have achieved good performance in detecting SATDs in code comments, they do not explain their predictions. Large language models such as ChatGPT are increasingly being applied to text classification tasks because they can provide explanations for classification results, but it is unclear how effective ChatGPT is for SATD classification. As the first in-depth study of ChatGPT for SATD detection, we evaluate ChatGPT's effectiveness, compare it with small deep learning models, and find that ChatGPT performs better on Recall, while small models perform better on Precision. Furthermore, to enhance the performance of these approaches, we propose a novel fusion approach named FSATD, which combines ChatGPT with small models for SATD detection so as to provide reliable explanations. Through extensive experiments on 62,276 comments from 10 open-source projects, we show that FSATD outperforms existing methods on F1-score in cross-project scenarios. Additionally, FSATD allows for flexible adjustment of fusion strategies, adapting to the requirements of different application scenarios, and can achieve the best Precision, Recall, or F1-score. © 2024 John Wiley & Sons Ltd.
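
The trade-off described in the abstract (one classifier stronger on Recall, the other on Precision) can be illustrated with a minimal Python sketch. This is not the FSATD method, whose details the abstract does not give; the two fusion rules (`fuse_or`, `fuse_and`), the function names, and the toy labels are all assumptions made for illustration only.

```python
from typing import List, Tuple


def precision_recall_f1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float]:
    """Compute Precision, Recall, and F1 for binary labels (1 = SATD comment)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def fuse_or(a: List[int], b: List[int]) -> List[int]:
    """Hypothetical recall-oriented rule: flag a comment if either model flags it."""
    return [int(x or y) for x, y in zip(a, b)]


def fuse_and(a: List[int], b: List[int]) -> List[int]:
    """Hypothetical precision-oriented rule: flag a comment only if both models agree."""
    return [int(x and y) for x, y in zip(a, b)]


# Toy labels, invented for illustration (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
llm_pred = [1, 1, 0, 1, 0, 0, 1, 0]    # stands in for a higher-Recall classifier
small_pred = [1, 0, 1, 0, 0, 0, 0, 0]  # stands in for a higher-Precision classifier

for name, pred in [
    ("llm", llm_pred),
    ("small", small_pred),
    ("or-fusion", fuse_or(llm_pred, small_pred)),
    ("and-fusion", fuse_and(llm_pred, small_pred)),
]:
    p, r, f1 = precision_recall_f1(y_true, pred)
    print(f"{name:>10}: P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

On this toy data the OR rule raises Recall at some cost in Precision, while the AND rule does the opposite, which is one simple way a fusion strategy could be tuned toward whichever metric an application cares about.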

Research Area(s)

  • ChatGPT, fusion, performance and interpretability, self-admitted technical debt, small deep learning models