Abstract
Large language models (LLMs) have demonstrated their capabilities across various natural language processing (NLP) tasks. Their potential in e-commerce is also substantial, evidenced by existing implementations in scenarios such as platform search and recommender systems. One obstinate concern associated with LLMs is the factuality issue (e.g., hallucination), which is urgent in e-commerce due to its significant impact on user experience and revenue. While some methods aim to evaluate the factuality of LLMs, issues such as lack of objectivity, high consumption, and lack of domain expertise arise. To this end, leveraging a collected knowledge graph (KG) as a reliable source, we propose ECKGBench, a question-answering dataset to assess LLMs' capacity in e-commerce. Specifically, each question is automatically generated based on one KG triple through a standardized pipeline, guaranteeing evaluation quality and reliability. We evaluate advanced LLMs using ECKGBench and provide insights into experimental results. The dataset is available online at https://github.com/OpenStellarTeam/ECKGBench. © 2025 Copyright held by the owner/author(s).
| Original language | English |
|---|---|
| Title of host publication | CIKM '25 - Proceedings of the 34th ACM International Conference on Information and Knowledge Management |
| Publisher | Association for Computing Machinery |
| Pages | 6461-6465 |
| ISBN (Print) | 9798400720406 |
| Publication status | Published - Nov 2025 |
| Event | 34th ACM International Conference on Information and Knowledge Management (CIKM 2025), COEX, Seoul, Korea, Republic of. Duration: 10 Nov 2025 → 14 Nov 2025. https://cikm2025.org/ |
Publication series
| Name | CIKM - Proceedings of the ACM International Conference on Information and Knowledge Management |
|---|---|
Conference
| Conference | 34th ACM International Conference on Information and Knowledge Management (CIKM 2025) |
|---|---|
| Abbreviated title | CIKM '25 |
| Place | Korea, Republic of |
| City | Seoul |
| Period | 10/11/25 → 14/11/25 |
| Internet address | https://cikm2025.org/ |
Bibliographical note
Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).
Research Keywords
- e-commerce
- factuality evaluation
- large language models
Fingerprint
Dive into the research topics of 'ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph'. Together they form a unique fingerprint.