TY - JOUR
T1 - Automatic categorization of questions for user-interactive question answering
AU - Song, Wanpeng
AU - Wenyin, Liu
AU - Gu, Naijie
AU - Quan, Xiaojun
AU - Hao, Tianyong
PY - 2011
Y1 - 2011/03//
AB - Question categorization, which suggests one of a set of predefined categories for a user's question according to the question's topic or content, is a useful technique in user-interactive question answering systems. In this paper, we propose an automatic method for question categorization in a user-interactive question answering system. The method has four steps: feature space construction, topic-wise word identification and weighting, semantic mapping, and similarity calculation. We first construct the feature space from all accumulated questions and calculate the feature vector of each predefined category, which contains some of the accumulated questions. When a new question is posted, its semantic pattern is used to identify and weight its important words. The question is then semantically mapped into the constructed feature space to enrich its representation. Finally, the similarity between the question and each category is calculated from their feature vectors, and the category with the highest similarity is assigned to the question. The experimental results show that the proposed method achieves good categorization precision and outperforms traditional categorization methods on the selected test questions. © 2010 Elsevier Ltd. All rights reserved.
KW - Question answering
KW - Question categorization
KW - Text similarity calculation
UR - http://www.scopus.com/inward/record.url?scp=79951946311&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-79951946311&origin=recordpage
U2 - 10.1016/j.ipm.2010.03.002
DO - 10.1016/j.ipm.2010.03.002
M3 - RGC 21 - Publication in refereed journal
SN - 0306-4573
VL - 47
SP - 147
EP - 156
JO - Information Processing and Management
JF - Information Processing and Management
IS - 2
ER -