Memory-Based Architecture for Multicharacter Aho-Corasick String Matching

Research output: Journal Publications and Reviews (RGC: 21, 22, 62)21_Publication in refereed journalpeer-review

7 Scopus Citations
View graph of relations

Author(s)

Related Research Unit(s)

Detail(s)

Original languageEnglish
Pages (from-to)143-154
Journal / PublicationIEEE Transactions on Very Large Scale Integration (VLSI) Systems
Volume26
Issue number1
Online published28 Sep 2017
Publication statusPublished - Jan 2018

Abstract

The Aho-Corasick (AC) string matching algorithm is commonly used in signature-based intrusion detection and antivirus systems. In this paper, we present a hardware implementation of the AC algorithm that can process multiple characters per cycle. State transitions are implemented using table lookup. Hence, updates to the pattern set do not require hardware reconfiguration. A fundamental issue of multicharacter AC string matching is the transition rules explosion problem. We apply the transition rule reduction strategy such that the number of transition rules can be reduced to less than two rules per state in the finite automaton. The memory cost is further optimized by implementing the rule table as near-minimal perfect hash table. We implement the proposed method on a virtex-7 FPGA for a pattern set with 2200 Snort signatures. The normalized memory cost of our design is 13.75 bytes per character of the pattern set when four characters are processed per cycle, and the FPGA can operate at 216 MHz.

Research Area(s)

  • Hardware architecture, perfect hash table, string matching, transition rule reduction, INTRUSION DETECTION, PACKET INSPECTION, EFFICIENT, ALGORITHM