Abstract
Graded reading tailors text difficulty to learners' proficiency by producing multiple versions of the same content—an approach long embraced in language education but still dependent on labor-intensive, expert-driven adaptation. In this paper, we introduce the task of Chinese Graded Document Simplification (CGDS) for non-native learners, which seeks to automate the creation of multi-level reading materials in accordance with established proficiency standards. Guided by the three stages of the Hanyu Shuiping Kaoshi (HSK) 3.0 framework (Levels 1–3 for Beginner, Levels 4–6 for Intermediate, and Levels 7–9 for Advanced learners), we propose Large Language Model for Chinese Graded Document Simplification (LLM4CGDS), a rule-guided, large language model (LLM)-based framework that integrates HSK-level readability constraints and external knowledge retrieval to control document-level simplification without requiring supervised fine-tuning. To foster further research, we construct two complementary datasets, Journey to the West Document Simplification (JWDS) and Multi-Domain Document Simplification (MDDS), covering diverse genres and difficulty levels. Experimental evaluation on both datasets demonstrates that LLM4CGDS substantially outperforms direct prompting of state-of-the-art LLMs in both readability control and meaning preservation. © 2026 Elsevier Ltd.
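The abstract describes a rule-guided loop in which LLM output is constrained by HSK-level readability rules. The sketch below is purely illustrative and not the paper's implementation: it assumes a level-specific vocabulary list and a generic `llm_simplify` callable (both hypothetical names), and retries the simplification with feedback when too many out-of-level tokens remain.

```python
def oov_ratio(tokens, level_vocab):
    """Fraction of tokens outside the target-level vocabulary."""
    if not tokens:
        return 0.0
    return sum(t not in level_vocab for t in tokens) / len(tokens)

def graded_simplify(document, level_vocab, llm_simplify, max_oov=0.05, max_tries=3):
    """Rule-guided loop (hypothetical): call an LLM to simplify `document`
    toward a target HSK level, then check the output against the level's
    vocabulary and retry with feedback until the constraint is satisfied."""
    feedback = ""
    text = document
    for _ in range(max_tries):
        text = llm_simplify(text, feedback)
        tokens = list(text)  # character-level stand-in for a real Chinese tokenizer
        ratio = oov_ratio(tokens, level_vocab)
        if ratio <= max_oov:
            return text
        feedback = f"{ratio:.0%} of characters exceed the target level; replace them."
    return text

# Toy usage with a fake "LLM" that simply drops out-of-level characters.
vocab = set("我你他是不好学中文")
fake_llm = lambda text, fb: "".join(c for c in text if c in vocab)
print(graded_simplify("我非常喜欢学中文", vocab, fake_llm))
```

In the actual framework the feedback loop would carry richer rule violations (grammar patterns, sentence length, retrieved background knowledge) back into the prompt; the vocabulary check here is only the simplest such constraint.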
| Original language | English |
|---|---|
| Article number | 113905 |
| Number of pages | 13 |
| Journal | Engineering Applications of Artificial Intelligence |
| Volume | 169 |
| Online published | 7 Feb 2026 |
| DOIs | |
| Publication status | Published - 1 Apr 2026 |
Funding
This research is partially supported by the National Natural Science Foundation of China under grant 62076217, and by the National Language Commission under grant ZDI145-71.
Research Keywords
- Graded reading
- Text simplification
- Large language modeling
- Hanyu Shuiping Kaoshi