Abstract
The advent of natural language understanding (NLU) benchmarks for English, such as GLUE and SuperGLUE, allows new NLU models to be evaluated across a diverse set of tasks. These comprehensive benchmarks have facilitated a broad range of research and applications in natural language processing (NLP). The problem, however, is that most such benchmarks are limited to English, which has made it difficult to replicate many of the successes in English NLU for other languages. To help remedy this issue, we introduce the first large-scale Chinese Language Understanding Evaluation (CLUE) benchmark. CLUE is an open-ended, community-driven project that brings together 9 tasks spanning several well-established single-sentence/sentence-pair classification tasks, as well as machine reading comprehension, all on original Chinese text. To establish results on these tasks, we report scores using an exhaustive set of current state-of-the-art pre-trained Chinese models (9 in total). We also introduce a number of supplementary datasets and additional tools to help facilitate further progress on Chinese NLU. Our benchmark is released at https://www.CLUEbenchmarks.com.
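The official data and leaderboard live at the URL above. As a concrete illustration of what the benchmark contains, the sketch below loads one of the single-sentence classification tasks (TNEWS, news-headline topic classification). It assumes the CLUE tasks are mirrored on the Hugging Face Hub under the `clue` dataset name; this mirror is a community resource, not part of the paper itself.

```python
# A minimal sketch of inspecting one CLUE task, assuming the tasks are
# mirrored on the Hugging Face Hub as the "clue" dataset. The official
# release lives at https://www.CLUEbenchmarks.com.
from datasets import load_dataset

# TNEWS: short-text (news headline) topic classification, one of the
# benchmark's single-sentence classification tasks.
tnews = load_dataset("clue", "tnews")

print(tnews)                   # train / validation / test splits
print(tnews["validation"][0])  # one example, e.g. {'sentence': ..., 'label': ...}
```

The other tasks (e.g. the sentence-pair and reading-comprehension sets) follow the same pattern, differing only in the configuration name and example fields.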
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 28th International Conference on Computational Linguistics |
| Editors | Donia Scott, Nuria Bel, Chengqing Zong |
| Publisher | International Committee on Computational Linguistics |
| Pages | 4762-4772 |
| Number of pages | 11 |
| ISBN (Print) | 9781952148279 |
| DOIs | |
| Publication status | Published - Dec 2020 |
| Externally published | Yes |
| Event | 28th International Conference on Computational Linguistics, COLING 2020 (Virtual, Online, Spain). Duration: 8 Dec 2020 → 13 Dec 2020. https://aclanthology.org/2020.coling-main |
Conference
| Conference | 28th International Conference on Computational Linguistics, COLING 2020 |
|---|---|
| Country | Spain |
| City | Virtual, Online |
| Period | 8/12/20 → 13/12/20 |
| Internet address | https://aclanthology.org/2020.coling-main |
Bibliographical note
Research Unit(s) information for this publication is provided by the author(s) concerned.

Funding
The authors would like to thank everyone who has contributed their datasets to CLUE. We are also grateful to the annotators and engineers who have spent much of their time and effort helping with the creation of the CLUE benchmark. Special thanks to the following companies and organizations: OneConnect Financial Technology Co., Ltd, OpenBayes Co., Ltd, AI-Indeed.com, Alibaba Cloud Computing, Joint Laboratory of HIT and iFLYTEK Research (HFL). Research supported with Cloud TPUs from Google’s TensorFlow Research Cloud (TFRC).
Publisher's Copyright Statement
- This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/