Query-oriented unsupervised multi-document summarization via deep learning model

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) > 21_Publication in refereed journal > peer-review

58 Scopus Citations

Author(s)

  • Sheng-Hua Zhong
  • Yan Liu
  • Bin Li
  • Jing Long

Detail(s)

Original language: English
Article number: 10053
Pages (from-to): 8146-8155
Journal / Publication: Expert Systems with Applications
Volume: 42
Issue number: 21
Online published: 15 Jun 2015
Publication status: Published - 30 Nov 2015

Abstract

Capturing the compositional process from words to documents is a key challenge in natural language processing and information retrieval. Extractive query-oriented multi-document summarization generates a summary by extracting a proper set of sentences from multiple documents based on a pre-given query. This paper proposes a novel document summarization framework based on a deep learning model, which has shown outstanding extraction ability in many real-world applications. The framework consists of three parts: concept extraction, summary generation, and reconstruction validation. A new query-oriented extraction technique is proposed to extract information distributed across multiple documents. Then, the whole deep architecture is fine-tuned by minimizing the information loss in reconstruction validation. Based on the concepts extracted from the deep architecture layer by layer, dynamic programming is used to seek the most informative set of sentences for the summary. Experiments on three benchmark datasets (DUC 2005, 2006, and 2007) assess and confirm the effectiveness of the proposed framework and algorithms. Experimental results show that the proposed method outperforms state-of-the-art extractive summarization approaches. Moreover, we also provide a statistical analysis of query words based on Amazon's Mechanical Turk (MTurk) crowdsourcing platform. There exist underlying relationships from topic words to the content, which can contribute to the summarization task.
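
The abstract mentions using dynamic programming to seek the most informative set of sentences but gives no formulation. As a rough illustration only, the sketch below treats selection as a 0/1 knapsack problem: each sentence has an importance score and a length in words, and the summary has a word budget. The function name `select_sentences`, the per-sentence scores, and the budget are all assumptions made for this example. In the paper the importance would come from the deep architecture, and the authors' actual objective may differ.

```python
from typing import List, Sequence, Tuple


def select_sentences(
    scores: Sequence[float], lengths: Sequence[int], budget: int
) -> Tuple[float, List[int]]:
    """Pick sentence indices that maximize total score within a word budget.

    Hypothetical helper: a standard 0/1 knapsack, not the paper's exact method.
    """
    n = len(scores)
    # best[i][b] = best total score using the first i sentences and at most b words.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score, length = scores[i - 1], lengths[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip sentence i-1
            if length <= b:
                candidate = best[i - 1][b - length] + score  # option 2: take it
                if candidate > best[i][b]:
                    best[i][b] = candidate

    # Walk the table backwards to recover which sentences were taken.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= lengths[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


# Made-up scores and lengths, with a 5-word budget.
total, picked = select_sentences([3.0, 4.0, 5.0], [2, 3, 4], 5)
print(total, picked)
```

With these made-up inputs, the two shorter sentences together outscore the single highest-scoring sentence while still fitting the budget. A greedy pick by score alone would miss that trade-off.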

Research Area(s)

  • Deep learning, Multi-document, Neocortex simulation, Query-oriented summarization

Citation Format(s)

Query-oriented unsupervised multi-document summarization via deep learning model. / Zhong, Sheng-Hua; Liu, Yan; Li, Bin et al.

In: Expert Systems with Applications, Vol. 42, No. 21, 10053, 30.11.2015, p. 8146-8155.
