TY - GEN
T1 - Building implicit links from content for forum search
AU - Xu, Gu
AU - Ma, Wei-Ying
N1 - Publication details (e.g. title, author(s), publication statuses and dates) are captured on an “AS IS” and “AS AVAILABLE” basis at the time of record harvesting from the data source. Suggestions for further amendments or supplementary information can be sent to [email protected].
PY - 2006
Y1 - 2006
N2 - The objective of Web forums is to create a shared space for open communications and discussions of specific topics and issues. The tremendous information behind forum sites is not fully-utilized yet. Most links between forum pages are automatically created, which means the link-based ranking algorithm cannot be applied efficiently. In this paper, we proposed a novel ranking algorithm which tries to introduce the content information into link-based methods as implicit links. The basic idea is derived from the more focused random surfer: the surfer may more likely jump to a page which is similar to what he is reading currently. In this manner, we are allowed to introduce the content similarities into the link graph as a personalization bias. Our method, named Fine-grained Rank (FGRank), can be efficiently computed based on an automatically generated topic hierarchy. Not like the topic-sensitive PageRank, our method only need to compute single PageRank score for each page. Another contribution of this paper is to present a very efficient algorithm for automatically generating topic hierarchy and map each page in a large-scale collection onto the computed hierarchy. The experimental results show that the proposed method can improve retrieval performance, and reveal that content-based link graph is also important compared with the hyper-link graph. Copyright 2006 ACM.
AB - The objective of Web forums is to create a shared space for open communications and discussions of specific topics and issues. The tremendous information behind forum sites is not fully-utilized yet. Most links between forum pages are automatically created, which means the link-based ranking algorithm cannot be applied efficiently. In this paper, we proposed a novel ranking algorithm which tries to introduce the content information into link-based methods as implicit links. The basic idea is derived from the more focused random surfer: the surfer may more likely jump to a page which is similar to what he is reading currently. In this manner, we are allowed to introduce the content similarities into the link graph as a personalization bias. Our method, named Fine-grained Rank (FGRank), can be efficiently computed based on an automatically generated topic hierarchy. Not like the topic-sensitive PageRank, our method only need to compute single PageRank score for each page. Another contribution of this paper is to present a very efficient algorithm for automatically generating topic hierarchy and map each page in a large-scale collection onto the computed hierarchy. The experimental results show that the proposed method can improve retrieval performance, and reveal that content-based link graph is also important compared with the hyper-link graph. Copyright 2006 ACM.
KW - Categorization
KW - Clustering
KW - Forum search
KW - Hierarchy generation
KW - PageRank
UR - http://www.scopus.com/inward/record.url?scp=33750479390&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-33750479390&origin=recordpage
U2 - 10.1145/1148170.1148224
DO - 10.1145/1148170.1148224
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 1595933697
SN - 9781595933690
VL - 2006
T3 - Proceedings of the Twenty-Ninth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
SP - 300
EP - 307
BT - Proceedings of the Twenty-Ninth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
PB - Association for Computing Machinery
T2 - 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Y2 - 6 August 2006 through 11 August 2006
ER -