Identifying Textual Features of High-Quality Questions: An Empirical Study on Stack Overflow

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) > 32_Refereed conference paper (with ISBN/ISSN)

Detail(s)

Original language: English
Title of host publication: Proceedings - 24th Asia-Pacific Software Engineering Conference, APSEC 2017
Editors: Jian Lv, He (Jason) Zhang, Mike Hinchey, Xiao Liu
Publisher: IEEE
Pages: 636-641
ISBN (Electronic): 978-1-5386-3681-7
Publication status: Published - Dec 2017

Publication series

Name: Asia-Pacific Software Engineering Conference
Publisher: IEEE
ISSN (Print): 1530-1362

Conference

Title: 24th Asia-Pacific Software Engineering Conference, APSEC 2017
Place: China
City: Nanjing, Jiangsu
Period: 4 - 8 December 2017

Abstract

Background: Stack Overflow (SO) is a programming-specific Q&A website that serves as a valuable repository of software engineering knowledge. For SO members, formulating a good question is the first step towards eliciting satisfactory responses. Aims: To guide SO members on how to formulate a good question, we conduct an empirical study using the publicly available Stack Overflow Data Dump for the period 2008-2016. Method: We first choose 25 features along 5 dimensions to represent the textual characteristics of interest. Using the Boruta algorithm, we then identify all features that are either strongly or weakly relevant to question quality. Results: The number of tags and code snippets are the most discriminative features, whereas question quality is only weakly correlated with sentiment-related factors. Based on this empirical evidence, we provide practical suggestions to SO members on how to improve their questions. Conclusions: We expect our findings to give SO members a better understanding of the patterns behind high-quality questions, with the ultimate goal of supporting effective and efficient use of Q&A websites.
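
As a rough illustration of the two features the abstract singles out, the sketch below counts tags and code snippets for a single question. It is not the authors' implementation: the function names are hypothetical, reading both features as simple counts is an assumption, and the input layout (HTML body with `<pre><code>` blocks, tags serialized as `<tag1><tag2>`) is assumed from the way older Stack Overflow data dumps store posts. The other features and the Boruta selection step are not reproduced here.

```python
import re

# Assumption: code snippets appear in the question body as
# <pre><code>...</code></pre> blocks.
CODE_BLOCK = re.compile(
    r"<pre[^>]*>\s*<code[^>]*>.*?</code>\s*</pre>",
    re.DOTALL | re.IGNORECASE,
)

# Assumption: the tags field is serialized as "<tag1><tag2>...".
TAG = re.compile(r"<([^<>]+)>")


def count_code_snippets(body_html):
    """Count block-level code snippets in an HTML question body."""
    return len(CODE_BLOCK.findall(body_html))


def count_tags(tags_field):
    """Count the tags in a '<tag1><tag2>' style field."""
    return len(TAG.findall(tags_field))


def extract_features(body_html, tags_field):
    """Return the two illustrative features as a dict (hypothetical names)."""
    return {
        "num_tags": count_tags(tags_field),
        "num_code_snippets": count_code_snippets(body_html),
    }


# Made-up example question, not taken from the data dump.
example_body = (
    "<p>Why does this fail?</p>"
    "<pre><code>print(1)</code></pre>"
    "<p>Also:</p>"
    "<pre><code>x = 2\ny = 3</code></pre>"
)
example_tags = "<python><regex><html>"
features = extract_features(example_body, example_tags)
```

A feature table built this way, one row per question, is the kind of input a feature-selection method such as Boruta would then rank against a quality label.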

Research Area(s)

  • Boruta algorithm, empirical software engineering, Q&A website, Stack Overflow, textual feature

Citation Format(s)

Identifying Textual Features of High-Quality Questions: An Empirical Study on Stack Overflow. / Mi, Qing; Gao, Yujin; Keung, Jacky; Xiao, Yan; Mensah, Solomon.

Proceedings - 24th Asia-Pacific Software Engineering Conference, APSEC 2017. ed. / Jian Lv; He (Jason) Zhang; Mike Hinchey; Xiao Liu. IEEE, 2017. p. 636-641 (Asia-Pacific Software Engineering Conference).