Machine translation-based bug localization technique for bridging lexical gap
Journal / Publication: Information and Software Technology
Early online date: 6 Mar 2018
State: Published - Jul 2018
Link to Scopus: https://www.scopus.com/record/display.uri?eid=2-s2.0-85043476087&origin=recordpage
Objective: To bridge the lexical gap and improve the effectiveness of localizing buggy files by leveraging the extracted semantic information from bug reports and source code.
Method: We present BugTranslator, a novel deep learning-based machine translation technique built on an attention-based recurrent neural network (RNN) Encoder-Decoder with long short-term memory (LSTM) cells. One RNN encodes a bug report into several context vectors, which another RNN decodes into the code tokens of buggy files. The technique learns and exploits the relevance between the semantic information extracted from bug reports and from source files.
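The attention step in such an Encoder-Decoder can be illustrated with a minimal sketch. It assumes plain dot-product scoring between a decoder state and the encoder states; the abstract does not say which scoring function BugTranslator uses, and the function names `softmax` and `attention_context` are hypothetical, chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(decoder_state, encoder_states):
    """Compute attention weights and one context vector.

    Assumption: dot-product scoring. The source only says the model is
    "attention-based"; it does not specify the scoring function.
    """
    # Score each encoder state (one per bug-report token) against the
    # current decoder state.
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # Toy 2-dimensional states; real models use learned LSTM hidden states.
    encoder_states = [[1.0, 0.0], [0.0, 1.0]]
    decoder_state = [10.0, 0.0]
    weights, context = attention_context(decoder_state, encoder_states)
    print(weights, context)
```

In a full model the decoder would recompute one such context vector at every output step, so each generated code token can focus on different words of the bug report.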
Results: Experiments on three open-source projects show that BugTranslator outperforms a current state-of-the-art word embedding technique, achieving higher mean average precision (MAP) and mean reciprocal rank (MRR). On average, BugTranslator ranks the actual buggy files in second or third place.
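For readers unfamiliar with the two metrics, the following sketch shows how MAP and MRR are commonly computed over ranked file lists. It assumes the standard information-retrieval definitions, with average precision normalized by the number of truly buggy files; the abstract does not state the exact variant used in the evaluation, and the file names and helper names here are hypothetical.

```python
def reciprocal_rank(ranked_files, buggy_files):
    """1 / rank of the first truly buggy file, or 0.0 if none is ranked."""
    for rank, name in enumerate(ranked_files, start=1):
        if name in buggy_files:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_files, buggy_files):
    """Mean of precision@k taken at each rank k that holds a buggy file.

    Assumption: normalized by the total number of buggy files.
    """
    if not buggy_files:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, name in enumerate(ranked_files, start=1):
        if name in buggy_files:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(buggy_files)


def mean_over_reports(metric, reports):
    """Average a per-report metric over all (ranking, buggy set) pairs."""
    return sum(metric(ranking, buggy) for ranking, buggy in reports) / len(reports)


if __name__ == "__main__":
    # Hypothetical rankings for two bug reports.
    reports = [
        (["A.java", "B.java", "C.java"], {"B.java"}),
        (["X.java", "Y.java", "Z.java"], {"X.java", "Z.java"}),
    ]
    print("MRR:", mean_over_reports(reciprocal_rank, reports))
    print("MAP:", mean_over_reports(average_precision, reports))
```

An MRR near 0.5 corresponds to the first buggy file typically appearing around the second position, which is how a statement about average rank relates to the metric.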
Conclusion: BugTranslator separates bug reports and source code into different symbolic classes and then extracts deep semantic similarity and relevance between bug reports and their corresponding buggy files. This bridges the lexical gap at its source and thereby further improves bug localization performance.
Keywords: Bug localization, Deep learning, Lexical mismatch, Machine translation