A two-stage text mining model for information filtering
Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) › 32_Refereed conference paper (with host publication) › peer-review
Detail(s)
Original language | English |
---|---|
Title of host publication | International Conference on Information and Knowledge Management, Proceedings |
Pages | 1023-1032 |
Publication status | Published - 2008 |
Conference
Title | 17th ACM Conference on Information and Knowledge Management, CIKM'08 |
---|---|
Place | United States |
City | Napa Valley, CA |
Period | 26 - 30 October 2008 |
Abstract
Mismatch and overload are the two fundamental issues regarding the effectiveness of information filtering. Both term-based and pattern (phrase) based approaches have been employed to address these issues. However, both suffer from limitations with regard to effectiveness. This paper proposes a novel solution that includes two stages: an initial topic filtering stage followed by a stage involving pattern taxonomy mining. The objective of the first stage is to address mismatch by quickly filtering out probably irrelevant documents. The threshold used in the first stage is motivated theoretically. The objective of the second stage is to address overload by applying pattern mining techniques to rationalize the data relevance of the reduced document set that remains after the first stage. Substantial experiments on RCV1 show that the proposed solution achieves encouraging performance. Copyright 2008 by ACM.
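The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the term weighting, the fixed `threshold` value, the frequent term-set miner, and every function name below are assumptions chosen for illustration only. In particular, the paper derives its first-stage threshold theoretically and mines a pattern taxonomy, neither of which is reproduced here.

```python
from collections import Counter
from itertools import combinations


def term_weights(relevant_docs):
    """Assumed stage-1 profile: fraction of relevant training docs containing each term."""
    df = Counter(t for doc in relevant_docs for t in set(doc))
    n = len(relevant_docs)
    return {t: c / n for t, c in df.items()}


def topic_score(doc, weights):
    """Sum the profile weights of the distinct terms in a document."""
    return sum(weights.get(t, 0.0) for t in set(doc))


def mine_patterns(relevant_docs, min_support=2, max_len=2):
    """Assumed stage-2 miner: term sets of size 1..max_len with document support >= min_support."""
    counts = Counter()
    for doc in relevant_docs:
        terms = sorted(set(doc))
        for k in range(1, max_len + 1):
            counts.update(combinations(terms, k))
    return {p: c for p, c in counts.items() if c >= min_support}


def pattern_score(doc, patterns):
    """Add up the support of every mined pattern fully contained in the document."""
    terms = set(doc)
    return sum(sup for p, sup in patterns.items() if terms.issuperset(p))


def two_stage_filter(docs, relevant_docs, threshold):
    """Stage 1 drops low-scoring docs; stage 2 ranks the survivors by pattern score."""
    weights = term_weights(relevant_docs)
    patterns = mine_patterns(relevant_docs)
    survivors = [d for d in docs if topic_score(d, weights) >= threshold]
    return sorted(survivors, key=lambda d: pattern_score(d, patterns), reverse=True)


# Toy data, invented for this example (not from RCV1).
relevant = [["text", "mining", "filter"], ["text", "mining", "pattern"]]
incoming = [["text", "mining", "pattern"], ["football", "score"], ["text", "filter"]]
ranked = two_stage_filter(incoming, relevant, threshold=1.0)
```

In this toy run the first stage discards the off-topic document cheaply, so the more expensive pattern matching only runs on the reduced set, which mirrors the division of labor between the two stages.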
Research Area(s)
- Decision rules, Information filtering, Text mining, Thresholds, Weighting schema
Citation Format(s)
A two-stage text mining model for information filtering. / Li, Yuefeng; Zhou, Xujuan; Bruza, Peter et al.
International Conference on Information and Knowledge Management, Proceedings. 2008. p. 1023-1032.