AggregateRank: Bringing order to Web sites

Guang Feng, Tie-Yan Liu, Ying Wang, Ying Bao, Zhiming Ma, Xu-Dong Zhang, Wei-Ying Ma

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

21 Citations (Scopus)

Abstract

Since the website is one of the most important organizational structures of the Web, ranking websites effectively is essential to many Web applications, such as Web search and crawling. To rank websites, researchers have typically described the inter-connectivity among websites with a so-called HostGraph, in which the nodes denote websites and the edges denote linkages between websites (there is an edge between two websites if and only if there are hyperlinks from the pages in one website to the pages in the other), and then applied the random walk model to the HostGraph. However, as pointed out in this paper, a random walk over such a HostGraph is not reasonable because it is not in accordance with the browsing behavior of web surfers. Therefore, the derived rank cannot represent the true probability of visiting the corresponding website. In this work, we mathematically prove that the probability of the random web surfer visiting a website should be equal to the sum of the PageRank values of the pages inside that website. Nevertheless, since the number of web pages is much larger than the number of websites, it is not feasible to base the calculation of website ranks on the calculation of PageRank. To tackle this problem, we propose a novel method named AggregateRank, rooted in the theory of stochastic complement, which not only approximates the sum of PageRank accurately but also has a lower computational complexity than PageRank. Both theoretical analysis and experimental evaluation show that AggregateRank is a better method for ranking websites than previous methods. Copyright 2006 ACM.
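The quantity the abstract identifies as the correct website rank, the sum of the PageRank values of a site's pages, can be illustrated on a toy graph. The sketch below is not AggregateRank itself: it is the brute-force page-level summation that the abstract describes as impractical at Web scale, written in plain Python. The page names, the damping factor of 0.85, the uniform handling of dangling pages, and the helper names `pagerank` and `site_rank_by_summation` are all assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict


def pagerank(links, damping=0.85, iters=100):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Rank held by dangling pages (no out-links) is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new[q] += damping * rank[p] / len(outs)
        rank = new
    return rank


def site_rank_by_summation(page_rank, site_of):
    """Sum page-level PageRank values per website."""
    totals = defaultdict(float)
    for page, score in page_rank.items():
        totals[site_of(page)] += score
    return dict(totals)


# Hypothetical toy web: two sites with two pages each.
links = {
    "a.example/1": ["a.example/2", "b.example/1"],
    "a.example/2": ["a.example/1"],
    "b.example/1": ["a.example/1", "b.example/2"],
    "b.example/2": ["b.example/1"],
}
page_rank = pagerank(links)
site_rank = site_rank_by_summation(page_rank, lambda page: page.split("/")[0])
print(site_rank)
```

Because the page-level ranks form a probability distribution, the site-level ranks also sum to 1. The cost of this baseline is dominated by the page-level iteration, which is the step the paper's method is designed to avoid.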
Original language: English
Title of host publication: SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Publisher: Association for Computing Machinery
Pages: 75-82
ISBN (Print): 9781595933690
Publication status: Published - 2006
Externally published: Yes
Event: 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '06) - Seattle, WA, United States
Duration: 6 Aug 2006 to 11 Aug 2006

Publication series

Name: Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '06)
Place: United States
City: Seattle, WA
Period: 6/08/06 to 11/08/06

Research Keywords

  • AggregateRank
  • Coupling matrix
  • Stochastic complement theory
