Abstract
In this paper, we propose a filter-and-refine string join algorithm. The filtering phase rapidly prunes strings that are not joinable, and the refinement phase employs a more comprehensive algorithm to remove the remaining false alarms. The efficiency of the proposed scheme lies in the use of the precedence count matrix (PCM) for computing the edit distance between two sequences. With PCM, a sequence comparison takes constant time. We also evaluated the proposed sequence join algorithm, and our study shows that it outperforms the known techniques.
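The filter-and-refine pattern described above can be illustrated with a minimal sketch. The abstract does not define the PCM, so this example does not attempt to reproduce it: as a stand-in assumption, the filter uses two well-known lower bounds on edit distance (length difference and character-frequency difference), and the refinement step uses the classic dynamic-programming edit distance. The names `edit_distance`, `passes_filter`, `string_join`, and the threshold `k` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, O(len(a) * len(b))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def passes_filter(a: str, b: str, k: int) -> bool:
    """Cheap necessary conditions for edit_distance(a, b) <= k.

    Stand-in filter (an assumption, NOT the paper's PCM): a pair that
    fails either bound cannot be within distance k, so it is pruned.
    """
    if abs(len(a) - len(b)) > k:
        return False
    count_a, count_b = Counter(a), Counter(b)
    surplus_a = sum((count_a - count_b).values())  # characters only a has
    surplus_b = sum((count_b - count_a).values())  # characters only b has
    return max(surplus_a, surplus_b) <= k


def string_join(left, right, k):
    """Return all pairs (a, b) with edit distance at most k."""
    result = []
    for a in left:
        for b in right:
            # Filter first; run the expensive refinement only on survivors.
            if passes_filter(a, b, k) and edit_distance(a, b) <= k:
                result.append((a, b))
    return result


pairs = string_join(["kitten", "flaw"], ["sitting", "lawn", "mitten"], k=2)
```

In this sketch the filter only ever discards pairs that cannot join, so the refinement step decides the final answer; a stronger filter would change the running time but not the result.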
| Original language | English |
|---|---|
| Pages (from-to) | 345-348 |
| Journal | Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM |
| Volume | 16 |
| Publication status | Published - 2004 |
| Externally published | Yes |
| Event | Proceedings - 16th International Conference on Scientific and Statistical Database Management, SSDBM 2004 - Santorini Island, Greece. Duration: 21 Jun 2004 → 23 Jun 2004 |