Hierarchical summarization of large documents

Research output: Journal Publications and Reviews; RGC 22 - Publication in policy or professional journal

38 Scopus Citations

Author(s)

  • Christopher C. Yang
  • Fu Lee Wang

Detail(s)

Original language: English
Pages (from-to): 887-902
Journal / Publication: Journal of the American Society for Information Science and Technology
Volume: 59
Issue number: 6
Publication status: Published - Apr 2008

Abstract

Many automatic text summarization models have been developed in the last decades. Related research in information science has shown that human abstractors extract sentences for summaries based on the hierarchical structure of documents; however, the existing automatic summarization models do not take into account the human abstractor's behavior of sentence extraction and only consider the document as a sequence of sentences during the process of extraction of sentences as a summary. In general, a document exhibits a well-defined hierarchical structure that can be described as fractals, mathematical objects with a high degree of redundancy. In this article, we introduce the fractal summarization model based on the fractal theory. The important information is captured from the source document by exploring the hierarchical structure and salient features of the document. A condensed version of the document that is informatively close to the source document is produced iteratively using the contractive transformation in the fractal theory. The fractal summarization model is the first attempt to apply fractal theory to document summarization. It significantly improves the divergence of information coverage of summary and the precision of summary. User evaluations have been conducted. Results have indicated that fractal summarization is promising and outperforms current summarization techniques that do not consider the hierarchical structure of documents.
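
The abstract gives no algorithmic detail, so the following is only a rough sketch of the general idea of extracting sentences along a document hierarchy instead of from a flat sequence. Everything in it is an assumption made for illustration and not the authors' method: the `Node` type, the term-frequency `score` (a stand-in for the "salient features" mentioned above), the proportional splitting of a sentence quota among child sections, and the `threshold` parameter are all hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical document node: sections hold children, leaves hold sentences."""
    title: str
    sentences: list = field(default_factory=list)
    children: list = field(default_factory=list)


def all_sentences(node):
    """Collect every sentence under a node, in document order."""
    out = list(node.sentences)
    for child in node.children:
        out.extend(all_sentences(child))
    return out


def tokens(text):
    return re.findall(r"[a-z]+", text.lower())


def score(sentence, freq):
    """Assumed salience measure: mean document-wide frequency of the sentence's terms."""
    toks = tokens(sentence)
    return sum(freq[t] for t in toks) / len(toks) if toks else 0.0


def summarize(node, quota, freq, threshold=2):
    """Split the sentence quota among children by weight; extract once it is small."""
    if quota <= 0:
        return []
    if not node.children or quota <= threshold:
        sents = all_sentences(node)
        best = sorted(range(len(sents)),
                      key=lambda i: score(sents[i], freq), reverse=True)[:quota]
        return [sents[i] for i in sorted(best)]  # restore document order
    weights = [sum(score(s, freq) for s in all_sentences(c)) for c in node.children]
    total = sum(weights)
    if total == 0:
        return []
    # Largest-remainder rounding so the child quotas add up to the parent quota.
    shares = [quota * w / total for w in weights]
    alloc = [int(s) for s in shares]
    by_fraction = sorted(range(len(shares)),
                         key=lambda i: shares[i] - alloc[i], reverse=True)
    for i in by_fraction[:quota - sum(alloc)]:
        alloc[i] += 1
    summary = []
    for child, q in zip(node.children, alloc):
        summary.extend(summarize(child, q, freq, threshold))
    return summary


# Toy document, invented for this example only.
doc = Node("root", children=[
    Node("intro", sentences=["Fractal models describe documents.", "This is filler."]),
    Node("method", children=[
        Node("m1", sentences=["Fractal summarization uses document structure.",
                              "Cats sleep."]),
        Node("m2", sentences=["Documents have hierarchical structure."]),
    ]),
])
freq = Counter(tokens(" ".join(all_sentences(doc))))
summary = summarize(doc, 3, freq)
```

Because each section receives a share of the quota before any sentence is chosen, the selected sentences are spread across sections instead of being drawn from whichever part of the text scores highest overall. This is one plausible reading of the "information coverage" goal stated in the abstract.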

Citation Format(s)

Hierarchical summarization of large documents. / Yang, Christopher C.; Wang, Fu Lee.
In: Journal of the American Society for Information Science and Technology, Vol. 59, No. 6, 04.2008, p. 887-902.
