String searching engine for virus scanning

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review

14 Scopus Citations

Author(s)

  • Derek Pao
  • Xing Wang
  • Xiaoran Wang
  • Cong Cao
  • Yuesheng Zhu

Detail(s)

Original language: English
Article number: 5669261
Pages (from-to): 1596-1609
Journal / Publication: IEEE Transactions on Computers
Volume: 60
Issue number: 11
Publication status: Published - 2011

Abstract

A memory-efficient hardware string searching engine for antivirus applications is presented. The proposed QSV method is based on quick sampling of the input stream against fixed-length pattern prefixes, and on-demand verification of variable-length pattern suffixes. Patterns handled by the QSV method are required to have at least 16 bytes, and possess distinct 16-byte prefixes. The latter requirement can be fulfilled by a preprocessing procedure. The search engine uses the pipelined Aho-Corasick (P-AC) architecture developed by the first author to process 4 to 15-byte short patterns and a small number of exception cases. Our design was evaluated using the ClamAV virus database having 82,888 strings with a total size that exceeds 8 Mbyte. In terms of byte count, 99.3 percent of the pattern set is handled by the QSV method and 0.7 percent of the pattern set is handled by P-AC. A pattern with distinct 16-byte prefix only occupies up to three lookup table entries in QSV. The overall memory cost of our system is about 1.4 Mbyte, i.e., 1.4 bit per character of the ClamAV pattern set. The proposed method is memory-based, hence, updates to the pattern set can be accommodated by modifying the contents of the lookup tables without reconfiguring the hardware circuits. © 2006 IEEE.
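
As a rough software analogy of the prefix-then-suffix idea described in the abstract, the sketch below looks up each 16-byte window of the input in a table of distinct pattern prefixes and verifies the remaining suffix only when a prefix hits. It is not the authors' hardware design: the abstract does not detail the sampling scheme or the lookup-table layout, so the dictionary, the per-offset lookup, and the names `build_prefix_table` and `scan` are all assumptions made for illustration.

```python
PREFIX_LEN = 16  # prefix length stated in the abstract


def build_prefix_table(patterns):
    """Map each distinct 16-byte prefix to (suffix, pattern_id).

    Hypothetical helper: a Python dict stands in for the hardware
    lookup tables, whose real layout the abstract does not describe.
    """
    table = {}
    for pid, pat in enumerate(patterns):
        if len(pat) < PREFIX_LEN:
            # The abstract routes 4 to 15-byte patterns to P-AC instead.
            raise ValueError("pattern shorter than 16 bytes")
        prefix = pat[:PREFIX_LEN]
        if prefix in table:
            # The abstract says a preprocessing step makes prefixes distinct;
            # that step is not modeled here.
            raise ValueError("duplicate 16-byte prefix")
        table[prefix] = (pat[PREFIX_LEN:], pid)
    return table


def scan(data, table):
    """Return (offset, pattern_id) for every full pattern match in data."""
    matches = []
    for i in range(len(data) - PREFIX_LEN + 1):
        entry = table.get(data[i:i + PREFIX_LEN])
        if entry is None:
            continue  # no prefix hit, so no verification is needed
        suffix, pid = entry
        # On-demand verification of the variable-length suffix.
        if data.startswith(suffix, i + PREFIX_LEN):
            matches.append((i, pid))
    return matches
```

Because the prefixes are distinct, a prefix hit identifies at most one candidate pattern, so at most one suffix needs to be verified per input position in this sketch.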

Research Area(s)

  • antivirus system, embedded system, string searching, system security

Citation Format(s)

String searching engine for virus scanning. / Pao, Derek; Wang, Xing; Wang, Xiaoran; Cao, Cong; Zhu, Yuesheng.

In: IEEE Transactions on Computers, Vol. 60, No. 11, 5669261, 2011, p. 1596-1609.
