VCSUM: A Versatile Chinese Meeting Summarization Dataset

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) › 32_Refereed conference paper (with host publication) › peer-review

Author(s)

  • Mingjie Zhan
  • Zhaohui Hou
  • Ding Liang

Detail(s)

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics, ACL 2023
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Publisher: Association for Computational Linguistics
Pages: 6065–6079
ISBN (Print): 9781959429623
Publication status: Published - Jul 2023

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Title: 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)
Location: Westin Harbour Castle
Place: Canada
City: Toronto
Period: 9 - 14 July 2023

Abstract

Compared to news and chat summarization, the development of meeting summarization is hugely decelerated by the limited data. To this end, we introduce a versatile Chinese meeting summarization dataset, dubbed VCSUM, consisting of 239 real-life meetings, with a total duration of over 230 hours. We claim our dataset is versatile because we provide the annotations of topic segmentation, headlines, segmentation summaries, overall meeting summaries, and salient sentences for each meeting transcript. As such, the dataset can adapt to various summarization tasks or methods, including segmentation-based summarization, multi-granularity summarization and retrieval-then-generate summarization. Our analysis confirms the effectiveness and robustness of VCSUM. We also provide a set of benchmark models regarding different downstream summarization tasks on VCSUM to facilitate further research. The dataset and code will be released at https://github.com/hahahawu/VCSum.

© 2023 Association for Computational Linguistics.

Citation Format(s)

VCSUM: A Versatile Chinese Meeting Summarization Dataset. / Wu, Han; Zhan, Mingjie; Tan, Haochen et al.
Findings of the Association for Computational Linguistics, ACL 2023. ed. / Anna Rogers; Jordan Boyd-Graber; Naoaki Okazaki. Association for Computational Linguistics, 2023. p. 6065–6079 (Proceedings of the Annual Meeting of the Association for Computational Linguistics).
