Clustering tweets using Wikipedia concepts
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Detail(s)
Original language | English |
---|---|
Title of host publication | Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC'14) |
Publisher | European Language Resources Association (ELRA) |
Pages | 2262-2267 |
ISBN (print) | 9782951740884 |
Publication status | Published - May 2014 |
Conference
Title | 9th International Conference on Language Resources and Evaluation (LREC 2014) |
---|---|
Location | Harpa Concert Hall and Conference Center |
Place | Iceland |
City | Reykjavik |
Period | 26 - 31 May 2014 |
Abstract
Tweet clustering faces two notable challenges. First, data sparsity is severe, since no tweet can exceed 140 characters. Second, synonymy and polysemy are common, because users tend to express the same meaning in many different ways. Inspired by recent research indicating that Wikipedia is promising for text representation, we use Wikipedia concepts to represent tweets as concept vectors. We address polysemy with a Bayesian model and synonymy by exploiting Wikipedia redirections. To further alleviate data sparsity, we also make use of three types of out-links in Wikipedia. Evaluation on a Twitter dataset shows that the concept model outperforms the traditional vector space model (VSM) in tweet clustering.
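The idea of a concept vector can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no model details, so the toy tables `REDIRECTS`, `CANDIDATES`, and `CONTEXT_LIKELIHOOD`, the naive Bayes scoring, the smoothing value, and all function names are assumptions made for illustration. Redirects collapse synonymous surface forms onto one concept, a simple Bayesian score picks a sense for an ambiguous form, and cosine similarity compares the resulting vectors. The out-link expansion mentioned in the abstract is omitted.

```python
from collections import Counter
from math import log, sqrt

# Hypothetical toy tables; a real system would build them from a Wikipedia dump.
# Redirects map synonymous surface forms to one canonical concept (synonymy).
REDIRECTS = {"nyc": "New York City", "big apple": "New York City"}
# Ambiguous surface forms with assumed priors P(concept | surface form).
CANDIDATES = {"apple": {"Apple Inc.": 0.6, "Apple (fruit)": 0.4}}
# Assumed likelihoods P(context word | concept).
CONTEXT_LIKELIHOOD = {
    "Apple Inc.": {"iphone": 0.5, "pie": 0.01},
    "Apple (fruit)": {"iphone": 0.01, "pie": 0.5},
}
SMOOTHING = 1e-3  # assumed probability for context words unseen with a concept


def disambiguate(surface, context):
    """Pick the concept maximising log prior + sum of log likelihoods (naive Bayes)."""
    def score(concept):
        likelihood = CONTEXT_LIKELIHOOD.get(concept, {})
        return log(CANDIDATES[surface][concept]) + sum(
            log(likelihood.get(word, SMOOTHING)) for word in context
        )
    return max(CANDIDATES[surface], key=score)


def concept_vector(tokens):
    """Map pre-tokenised tweet terms to a bag of Wikipedia concepts."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        context = tokens[:i] + tokens[i + 1:]
        if tok in REDIRECTS:
            vec[REDIRECTS[tok]] += 1
        elif tok in CANDIDATES:
            vec[disambiguate(tok, context)] += 1
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    print(concept_vector(["apple", "iphone"]))
    print(cosine(concept_vector(["nyc"]), concept_vector(["big apple"])))
```

In this sketch, two tweets that share no words ("nyc" and "big apple") still end up with identical concept vectors, which is the effect on sparsity and synonymy that the abstract describes in general terms.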
Research Area(s)
- Tweet clustering, Tweet representation, Wikipedia concept
Bibliographic Note
Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).
Citation Format(s)
Clustering tweets using Wikipedia concepts. / Tang, Guoyu; Xia, Yunqing; Wang, Weizhi et al.
Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC'14). European Language Resources Association (ELRA), 2014. p. 2262-2267.