Short text clustering by finding core terms

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

40 Scopus Citations
Author(s)

  • Xingliang Ni
  • Xiaojun Quan
  • Zhi Lu
  • Liu Wenyin
  • Bei Hua

Detail(s)

Original language: English
Pages (from-to): 345-365
Journal / Publication: Knowledge and Information Systems
Volume: 27
Issue number: 3
Publication status: Published - Jun 2011

Abstract

A new clustering strategy, TermCut, is presented to cluster short text snippets by finding core terms in the corpus. We model the collection of short text snippets as a graph in which each vertex represents a short text snippet and each weighted edge measures the relationship between the two snippets it connects. TermCut is then applied to recursively select a core term and bisect the graph such that the snippets in one part contain the term, whereas those in the other part do not. We apply the proposed method to different types of short text snippets, including questions and search results. Experimental results show that the proposed method outperforms state-of-the-art clustering algorithms on short text snippets. © 2010 Springer-Verlag London Limited.
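
The recursive "select a term, then bisect" idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state how edges are weighted or how a core term is chosen, so the Jaccard edge weight, the `split_score` criterion (cross-part similarity divided by within-part similarity), and the `min_size` and `max_score` stopping parameters are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Edge weight between two snippets: overlap of their term sets (assumed)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def split_score(docs, with_term, without_term):
    """Illustrative bisection score (assumed, not the paper's criterion):
    similarity across the two parts divided by similarity inside them.
    Lower means a cleaner split."""
    cut = sum(jaccard(docs[i], docs[j]) for i in with_term for j in without_term)
    within = sum(
        jaccard(docs[i], docs[j])
        for part in (with_term, without_term)
        for i, j in combinations(part, 2)
    )
    return cut / (within + 1e-9)


def term_bisect(docs, ids=None, min_size=2, max_score=1.0):
    """Recursively pick a term and split the snippets into those that
    contain it and those that do not. Returns a list of clusters, each a
    list of snippet indices."""
    if ids is None:
        ids = list(range(len(docs)))
    if len(ids) < 2 * min_size:
        return [ids]

    best = None
    # Sorted so that ties between equally good terms resolve deterministically.
    for term in sorted(set().union(*(docs[i] for i in ids))):
        with_term = [i for i in ids if term in docs[i]]
        without_term = [i for i in ids if term not in docs[i]]
        if len(with_term) < min_size or len(without_term) < min_size:
            continue
        score = split_score(docs, with_term, without_term)
        if best is None or score < best[0]:
            best = (score, with_term, without_term)

    # Stop when no admissible term exists or the best split is too noisy.
    if best is None or best[0] > max_score:
        return [ids]
    _, with_term, without_term = best
    return (term_bisect(docs, with_term, min_size, max_score)
            + term_bisect(docs, without_term, min_size, max_score))


# Toy corpus: each snippet is already reduced to a set of terms.
docs = [
    {"python", "list", "sort"},
    {"python", "dict", "sort"},
    {"python", "list", "loop"},
    {"java", "class", "static"},
    {"java", "class", "interface"},
    {"java", "static", "method"},
]
clusters = term_bisect(docs)
```

On this toy corpus the sketch separates the three "python" snippets from the three "java" snippets in a single bisection. A term already used for a split cannot be chosen again inside the part that contains it, because every snippet there has the term and the complementary part would be empty.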

Research Area(s)

  • Clustering, Short text clustering, TermCut

Citation Format(s)

Short text clustering by finding core terms. / Ni, Xingliang; Quan, Xiaojun; Lu, Zhi et al.
In: Knowledge and Information Systems, Vol. 27, No. 3, 06.2011, p. 345-365.
