TY - GEN
T1 - On difficulties of cross-lingual transfer with order differences: A case study on dependency parsing
T2 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019
AU - Ahmad, Wasi Uddin
AU - Zhang, Zhisong
AU - Ma, Xuezhe
AU - Hovy, Eduard
AU - Chang, Kai-Wei
AU - Peng, Nanyun
N1 - Publication details (e.g. title, author(s), publication statuses and dates) are captured on an “AS IS” and “AS AVAILABLE” basis at the time of record harvesting from the data source. Suggestions for further amendments or supplementary information can be sent to [email protected].
PY - 2019/6/1
Y1 - 2019/6/1
N2 - Different languages might have different word orders. In this paper, we investigate cross-lingual transfer and posit that an order-agnostic model will perform better when transferring to distant foreign languages. To test our hypothesis, we train dependency parsers on an English corpus and evaluate their transfer performance on 30 other languages. Specifically, we compare encoders and decoders based on Recurrent Neural Networks (RNNs) and modified self-attentive architectures. The former relies on sequential information while the latter is more flexible at modeling word order. Rigorous experiments and detailed analysis show that RNN-based architectures transfer well to languages that are close to English, while self-attentive models have better overall cross-lingual transferability and perform especially well on distant languages. © 2019 Association for Computational Linguistics
AB - Different languages might have different word orders. In this paper, we investigate cross-lingual transfer and posit that an order-agnostic model will perform better when transferring to distant foreign languages. To test our hypothesis, we train dependency parsers on an English corpus and evaluate their transfer performance on 30 other languages. Specifically, we compare encoders and decoders based on Recurrent Neural Networks (RNNs) and modified self-attentive architectures. The former relies on sequential information while the latter is more flexible at modeling word order. Rigorous experiments and detailed analysis show that RNN-based architectures transfer well to languages that are close to English, while self-attentive models have better overall cross-lingual transferability and perform especially well on distant languages. © 2019 Association for Computational Linguistics
UR - http://www.scopus.com/inward/record.url?scp=85084084832&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85084084832&origin=recordpage
U2 - 10.18653/v1/N19-1253
DO - 10.18653/v1/N19-1253
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 9781950737130
VL - 1
T3 - NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
SP - 2440
EP - 2452
BT - Long and Short Papers
PB - Association for Computational Linguistics
Y2 - 2 June 2019 through 7 June 2019
ER -