Efficient phrase querying with common phrase index

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review

8 Scopus Citations

Author(s)

  • Matthew Chang
  • Chung Keung Poon

Detail(s)

Original language: English
Pages (from-to): 756-769
Journal / Publication: Information Processing and Management
Volume: 44
Issue number: 2
Publication status: Published - Mar 2008

Abstract

In this paper, we propose a common phrase index as an efficient index structure to support phrase queries in a very large text database. Our structure extends previous index structures for phrases and achieves better query efficiency at a modest extra storage cost. Further improvement in efficiency can be attained by implementing our index according to our observation of the dynamic nature of the common word set. In experimental evaluation, a common phrase index using 255 common words improves query time over an auxiliary nextword index by about 11% overall and by about 62% for large queries (queries of long phrases), at an extra storage cost of only about 19%. Compared with an inverted index, the improvement is about 72% overall and about 87% for large queries. We also propose implementing a common phrase index with a dynamic update feature. Our experiments show that this achieves a further improvement in time efficiency. © 2007 Elsevier Ltd. All rights reserved.
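
The abstract does not describe the internal layout of the index, so the following is only a toy sketch of the general idea it names: keep an ordinary positional inverted index, and add a second index keyed by short runs of common words so that a phrase made of frequent words can be looked up directly instead of intersecting several long posting lists. The function names, the `max_len` cap, and the in-memory dictionaries are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each word to its list of (doc_id, position) postings."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for pos, word in enumerate(text.lower().split()):
            index[word].append((doc_id, pos))
    return index


def build_common_phrase_index(docs, common_words, max_len=3):
    """Map each run of 2..max_len consecutive common words to its start postings.

    Hypothetical layout for illustration only.
    """
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for start in range(len(words)):
            for length in range(2, max_len + 1):
                run = words[start:start + length]
                # Stop extending once the run hits the end of the document
                # or includes a word outside the common word set.
                if len(run) < length or any(w not in common_words for w in run):
                    break
                index[tuple(run)].append((doc_id, start))
    return index


def phrase_query(phrase, inverted, common_phrases, common_words, max_len=3):
    """Return sorted (doc_id, start_position) pairs where the phrase occurs."""
    words = phrase.lower().split()
    candidates = None
    i = 0
    while i < len(words):
        # Greedily take the longest run of common words starting at i.
        j = i
        while j < len(words) and j - i < max_len and words[j] in common_words:
            j += 1
        if j - i >= 2:
            postings = common_phrases.get(tuple(words[i:j]), [])
        else:
            # Fall back to the single-word posting list.
            j = i + 1
            postings = inverted.get(words[i], [])
        # Shift each posting back to the position where the whole phrase would start.
        starts = {(doc_id, pos - i) for doc_id, pos in postings}
        candidates = starts if candidates is None else candidates & starts
        i = j
    return sorted(candidates or [])


docs = [
    "to be or not to be that is the question",
    "the question is not to be asked",
]
common = {"to", "be", "or", "not", "the", "is"}
inverted = build_inverted_index(docs)
common_phrases = build_common_phrase_index(docs, common)
print(phrase_query("not to be", inverted, common_phrases, common))
```

In this sketch the query "not to be" is answered with a single lookup in the common phrase index, whereas a plain inverted index would need to intersect three posting lists of frequent words. The extra dictionary is where the additional storage cost comes from.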

Research Area(s)

  • Auxiliary nextword index, Common phrase index, Indexing, Inverted index, Phrase query evaluation, Query evaluation

Citation Format(s)

Efficient phrase querying with common phrase index. / Chang, Matthew; Poon, Chung Keung.

In: Information Processing and Management, Vol. 44, No. 2, 03.2008, p. 756-769.
