Short text clustering by finding core terms
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Detail(s)
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 345-365 |
Journal / Publication | Knowledge and Information Systems |
Volume | 27 |
Issue number | 3 |
Publication status | Published - Jun 2011 |
Abstract
A new clustering strategy, TermCut, is presented to cluster short text snippets by finding core terms in the corpus. We model the collection of short text snippets as a graph in which each vertex represents a short text snippet and each weighted edge measures the relationship between the two snippets it connects. TermCut is then applied recursively: at each step it selects a core term and bisects the graph so that the snippets in one part contain the term, whereas those in the other part do not. We apply the proposed method to different types of short text snippets, including questions and search results. Experimental results show that the proposed method outperforms state-of-the-art clustering algorithms for clustering short text snippets. © 2010 Springer-Verlag London Limited.
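The recursive bisection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how edges are weighted or how a core term is scored, so the Jaccard edge weight, the cross-to-within similarity ratio in `cut_score`, the largest-cluster-first splitting order, and all function names here are assumptions made for illustration only.

```python
from itertools import combinations


def tokenize(text):
    """Represent a snippet as a set of lowercase whitespace-separated terms."""
    return set(text.lower().split())


def jaccard(a, b):
    """Assumed edge weight between two snippets (not taken from the paper)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cut_score(docs, inside, outside):
    """Assumed criterion: cross-part similarity over within-part similarity.

    Lower is better. The paper's actual criterion may differ.
    """
    cross = sum(jaccard(docs[i], docs[j]) for i in inside for j in outside)
    within = sum(
        jaccard(docs[i], docs[j])
        for part in (inside, outside)
        for i, j in combinations(part, 2)
    )
    return cross / (within + 1e-9)


def best_core_term(docs, ids):
    """Try every term as a splitter; return (score, term, inside, outside) or None."""
    vocab = set().union(*(docs[i] for i in ids))
    best = None
    for term in sorted(vocab):  # sorted for deterministic tie-breaking
        inside = [i for i in ids if term in docs[i]]
        outside = [i for i in ids if term not in docs[i]]
        if not inside or not outside:
            continue  # the term does not bisect this cluster
        score = cut_score(docs, inside, outside)
        if best is None or score < best[0]:
            best = (score, term, inside, outside)
    return best


def term_bisect(texts, max_clusters=2):
    """Repeatedly bisect the largest splittable cluster on its best term."""
    docs = [tokenize(t) for t in texts]
    clusters = [list(range(len(docs)))]
    terms = []
    while len(clusters) < max_clusters:
        clusters.sort(key=len, reverse=True)
        for idx, ids in enumerate(clusters):
            found = best_core_term(docs, ids) if len(ids) > 1 else None
            if found:
                break
        else:
            break  # no cluster can be split any further
        _, term, inside, outside = found
        clusters[idx:idx + 1] = [inside, outside]
        terms.append(term)
    return clusters, terms


snippets = [
    "how to install python on windows",
    "python install error on linux",
    "best pizza recipe with cheese",
    "cheese pizza dough recipe",
]
clusters, terms = term_bisect(snippets, max_clusters=2)
print(clusters, terms)
```

Each split separates the snippets that contain the chosen term from those that do not, which is the property the abstract describes. A realistic version would also need stop-word removal and a stopping rule other than a fixed cluster count.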
Research Area(s)
- Clustering, Short text clustering, TermCut
Citation Format(s)
Short text clustering by finding core terms. / Ni, Xingliang; Quan, Xiaojun; Lu, Zhi et al.
In: Knowledge and Information Systems, Vol. 27, No. 3, 06.2011, p. 345-365.