Bug Localization with Semantic and Structural Features using Convolutional Neural Network and Cascade Forest
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Detail(s)
Field | Value |
---|---|
Original language | English |
Title of host publication | Proceedings of the 22nd International Conference on Evaluation and Assessment in Software Engineering 2018, EASE 2018 |
Publisher | Association for Computing Machinery |
ISBN (print) | 9781450364034 |
Publication status | Published - Jun 2018 |
Publication series
Field | Value |
---|---|
Name | ACM International Conference Proceeding Series |
Conference
Field | Value |
---|---|
Title | 22nd Evaluation and Assessment in Software Engineering Conference (EASE 2018) |
Location | University of Canterbury |
Place | New Zealand |
City | Christchurch |
Period | 28 - 29 June 2018 |
Abstract
Background: Correctly localizing the buggy files for a bug report, using the semantic and structural information in both, is a crucial task, and doing so would improve the accuracy of bug localization techniques. Aims: To empirically evaluate and demonstrate the effects of semantic and structural information in bug reports and source files on the performance of bug localization, we propose CNN_Forest, which combines a convolutional neural network with an ensemble of random forests, both of which perform well in semantic parsing and structural information extraction. Method: We first employ a convolutional neural network with multiple filters and an ensemble of random forests with multi-grained scanning to extract semantic and structural features from the word vectors derived from bug reports and source files. A subsequent cascade forest (a cascade of ensembles of random forests) is then used to extract deeper features and observe the correlated relationships between bug reports and source files. CNN_Forest is then empirically evaluated on 10,754 bug reports extracted from the AspectJ, Eclipse UI, JDT, SWT, and Tomcat projects. Results: The experiments demonstrate the significance of including semantic and structural information in bug localization, and further show that CNN_Forest achieves higher Mean Average Precision and Mean Reciprocal Rank than the best results of four current state-of-the-art approaches (NP-CNN, LR+WE, DNNLOC, and BugLocator). Conclusion: CNN_Forest is capable of defining the correlated relationships between bug reports and source files, and we empirically show that semantic and structural information in bug reports and source files is crucial to improving bug localization.
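The abstract reports results as Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR) but does not spell out how they are computed. The sketch below uses the common information-retrieval definitions, which may differ in detail from the paper's evaluation script. The function names and the file names in the usage note are hypothetical, and dividing average precision by the total number of buggy files for a report is an assumption.

```python
from typing import List, Sequence, Set, Tuple


def reciprocal_rank(ranked_files: Sequence[str], buggy_files: Set[str]) -> float:
    """Return 1/rank of the first truly buggy file, or 0.0 if none is ranked."""
    for rank, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_files: Sequence[str], buggy_files: Set[str]) -> float:
    """Average the precision measured at each rank where a buggy file appears.

    Assumption: the sum is divided by the total number of buggy files,
    so buggy files missing from the ranking lower the score.
    """
    if not buggy_files:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(buggy_files)


def map_and_mrr(
    results: List[Tuple[Sequence[str], Set[str]]]
) -> Tuple[float, float]:
    """Return (MAP, MRR) over (ranked file list, buggy file set) pairs,
    one pair per bug report."""
    if not results:
        return 0.0, 0.0
    ap_total = sum(average_precision(r, b) for r, b in results)
    rr_total = sum(reciprocal_rank(r, b) for r, b in results)
    return ap_total / len(results), rr_total / len(results)
```

As a usage example, take two hypothetical bug reports. For the first, the ranking is `["A.java", "B.java", "C.java"]` and only `B.java` is buggy. For the second, the ranking is `["X.java", "Y.java", "Z.java"]` and both `X.java` and `Z.java` are buggy. The average precisions are 1/2 and (1/1 + 2/3)/2 = 5/6, so MAP is 2/3. The reciprocal ranks are 1/2 and 1, so MRR is 3/4.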
Research Area(s)
- Bug localization, Cascade forest, Convolutional neural network, Semantic information, Structural information, Word embedding
Citation Format(s)
Bug Localization with Semantic and Structural Features using Convolutional Neural Network and Cascade Forest. / Xiao, Yan; Keung, Jacky; Mi, Qing et al.
Proceedings of the 22nd International Conference on Evaluation and Assessment in Software Engineering 2018, EASE 2018. Association for Computing Machinery, 2018. (ACM International Conference Proceeding Series).