TY - GEN
T1 - How well do computers solve math word problems? Large-scale dataset construction and evaluation
AU - Huang, Danqing
AU - Shi, Shuming
AU - Lin, Chin-Yew
AU - Yin, Jian
AU - Ma, Wei-Ying
PY - 2016
Y1 - 2016
N2 - Recently a few systems for automatically solving math word problems have reported promising results. However, the datasets used for evaluation have limitations in both scale and diversity. In this paper, we build a large-scale dataset which is more than 9 times the size of previous ones, and contains many more problem types. Problems in the dataset are semiautomatically obtained from community question-answering (CQA) web pages. A ranking SVM model is trained to automatically extract problem answers from the answer text provided by CQA users, which significantly reduces human annotation cost. Experiments conducted on the new dataset lead to interesting and surprising results. © 2016 Association for Computational Linguistics.
UR - http://www.scopus.com/inward/record.url?scp=85011931361&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85011931361&origin=recordpage
U2 - 10.18653/v1/p16-1084
DO - 10.18653/v1/p16-1084
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 9781510827585
VL - 2
T3 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
SP - 887
EP - 896
BT - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
PB - ACL Anthology
T2 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Y2 - 7 August 2016 through 12 August 2016
ER -