Abstract
Faceted search on web pages requires accurate facets. However, facets are difficult to extract accurately because web pages are unstructured and lack facet information. Facet extraction is therefore key to faceted search. This paper proposes a method for automatically extracting facets from unstructured web pages to improve faceted search on the web. A Multidimensional Semantic Index (MDSI) of the web pages is constructed by mining the various kinds of semantic relations among the words in those pages, which yields a semantically rich index. Within the MDSI, the semantic indexes of different dimensions are bridged by mining the semantic mappings between them. Facets are then extracted by analyzing the semantic mapping relations in the MDSI. To validate the proposed method, two datasets are constructed; the experimental results show that the method is feasible and comparatively precise. © 2012 IEEE.
Original language | English |
---|---|
Title of host publication | Proceedings - 2012 8th International Conference on Semantics, Knowledge and Grids, SKG 2012 |
Pages | 64-71 |
Publication status | Published - 2012 |
Event | 2012 8th International Conference on Semantics, Knowledge and Grids, SKG 2012 - Beijing, China. Duration: 22 Oct 2012 → 24 Oct 2012 |
Conference
Conference | 2012 8th International Conference on Semantics, Knowledge and Grids, SKG 2012 |
---|---|
Country/Territory | China |
City | Beijing |
Period | 22/10/12 → 24/10/12 |
Research Keywords
- facet extraction
- faceted search
- multidimensional semantic index
- semantic mapping