Web object indexing using domain knowledge

Muyuan Wang, Zhiwei Li, Lie Lu, Wei-Ying Ma, Naiyao Zhang

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

A web object is defined as any meaningful object embedded in web pages (e.g. images, music) or pointed to by hyperlinks (e.g. downloadable files). In many cases, users would like to search for information about a particular 'object', rather than for a web page containing the query terms. To facilitate searching and organizing web objects, in this paper we propose a novel approach to web object indexing that discovers an object's inherent structure information using existing domain knowledge. In our approach, Layered LSI spaces are first built to better represent the hierarchically structured domain knowledge, emphasizing the specific semantics and term space in each layer of that knowledge. Meanwhile, the web object representation is constructed by hyperlink analysis and further pruned to remove noise. An optimal matching between the web object and the domain knowledge is then performed, in order to pick out the structure attributes of the web object from the knowledge. Finally, the obtained structure attributes are used to reorganize and index the web objects. Our approach also indicates a promising new way to use trustworthy Deep Web knowledge to help organize dispersed information on the Surface Web. Copyright 2005 ACM.
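
The layered matching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one small LSI space per layer of a hypothetical music hierarchy ("artist", "album"), uses plain cosine similarity in place of the paper's optimal matching, and omits hyperlink analysis, pruning, and confidence propagation entirely. All names and the toy data are invented for illustration.

```python
import numpy as np

def build_lsi(term_doc, k):
    """Truncated SVD of a term-by-entry matrix; returns the term basis
    U_k and the knowledge entries projected into the latent space."""
    U, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]
    entry_vecs = term_doc.T @ U_k  # one row per knowledge entry
    return U_k, entry_vecs

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def vectorize(text, vocab):
    """Raw term counts over a layer-specific vocabulary."""
    tokens = text.lower().split()
    return np.array([tokens.count(t) for t in vocab], dtype=float)

def match_layer(text, layer, k=2):
    """Return the knowledge entry in this layer closest to the text."""
    vocab, entries = layer["vocab"], layer["entries"]
    term_doc = np.column_stack([vectorize(d, vocab) for d in entries.values()])
    U_k, entry_vecs = build_lsi(term_doc, k)
    query = vectorize(text, vocab) @ U_k  # fold the text into the layer's space
    scores = [cosine(query, v) for v in entry_vecs]
    return list(entries)[int(np.argmax(scores))]

# Hypothetical two-layer domain knowledge, each layer with its own term space.
layers = {
    "artist": {
        "vocab": ["beatles", "queen"],
        "entries": {"The Beatles": "beatles", "Queen": "queen"},
    },
    "album": {
        "vocab": ["abbey", "road", "help", "opera", "night"],
        "entries": {
            "Abbey Road": "abbey road",
            "Help!": "help",
            "A Night at the Opera": "night opera",
        },
    },
}

# Hypothetical text gathered around a link to a downloadable music file.
anchor_text = "download beatles abbey road mp3"
attributes = {name: match_layer(anchor_text, layer) for name, layer in layers.items()}
print(attributes)
```

Building a separate space per layer keeps artist terms and album terms from competing in a single vocabulary, which is one plausible reading of the abstract's emphasis on the specific term space of each layer.
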
Original language: English
Title of host publication: KDD-2005 - Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 294-303
Publication status: Published - 2005
Externally published: Yes
Event: KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Chicago, IL, United States
Duration: 21 Aug 2005 to 24 Aug 2005

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference: KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Place: United States
City: Chicago, IL
Period: 21/08/05 to 24/08/05

Research Keywords

  • Confidence propagation
  • Domain knowledge
  • Indexing
  • Information retrieval
  • Latent semantic indexing
  • Link analysis
  • Music indexing
  • Web object
