TY - JOUR
T1 - A two-stage information filtering based on rough decision rule and pattern mining
AU - Zhou, Xujuan
AU - Li, Yuefeng
AU - Bruza, Peter
AU - Xu, Yue
AU - Lau, Raymond
PY - 2010/11
Y1 - 2010/11
N2 - Information overload and mismatch are two fundamental problems affecting the effectiveness of information filtering systems. Even though both term-based and pattern-based approaches have been proposed to address the problems of overload and mismatch, neither of these approaches alone can provide a satisfactory solution. This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern-based approaches to effectively filter the sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. The experimental results based on the RCV1 corpus show that the proposed two-stage filtering model significantly outperforms both term-based and pattern-based information filtering models. © 2010 ACADEMY PUBLISHER.
AB - Information overload and mismatch are two fundamental problems affecting the effectiveness of information filtering systems. Even though both term-based and pattern-based approaches have been proposed to address the problems of overload and mismatch, neither of these approaches alone can provide a satisfactory solution. This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern-based approaches to effectively filter the sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. The experimental results based on the RCV1 corpus show that the proposed two-stage filtering model significantly outperforms both term-based and pattern-based information filtering models. © 2010 ACADEMY PUBLISHER.
KW - Information filtering
KW - Pattern mining
KW - Rough set theory
KW - User profiles
UR - http://www.scopus.com/inward/record.url?scp=84863359000&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-84863359000&origin=recordpage
U2 - 10.4304/jetwi.2.4.326-332
DO - 10.4304/jetwi.2.4.326-332
M3 - RGC 21 - Publication in refereed journal
SN - 1798-0461
VL - 2
SP - 326
EP - 332
JO - Journal of Emerging Technologies in Web Intelligence
JF - Journal of Emerging Technologies in Web Intelligence
IS - 4
ER -