-
Notifications
You must be signed in to change notification settings - Fork 140
- Status:
- Version:
- Authors:
- NetFPGA base source:
We present a DFA -based Reg ular Ex pression ( RegEx ) matching engine that takes advantage of the architectural features of FPGAs to perform fast pattern matching.
a) Store the DFA states along with their transitions in a compact fashion by eliminating redundant information.
b) Organize the DFA states in parallel on-chip memory banks to ensure the state lookup and transition to be completed in a constant time.
c) Pipeline the design to further improve the system throughput.
We focus on the Perl Compatible Regular Expressions (PCRE) matching and assume that a header matching unit is available to determine which PCRE patterns the packet should be matched against.
We first use a regular expression compiler to convert Snort rules (the PCRE portion) to DFAs. We divide the PCREs into groups of N-PCREs where N can be 1 through a reasonable number k depending on the header classification. The generated DFAs are analyzed to determine whether a DFA state should be stored in on-chip memory banks or off-chip ones. The memory images are generated subsequently to initialize the memory components in the system.
The information of a DFA state and its transitions are encoded and stored in M on-chip memory banks, if the number of transitions is less than a threshold (six in our design). The on-chip memory banks are read in parallel and the input byte from the packet is compared against the encoded DFA state to determine the next state. The DFA states with larger number of transitions than the threshold are placed in off-chip memory units and the lookup of the next state is simply indexing the memory with the input byte.
Download from Here
We have designed a regular expression compiler to generate DFAs from Perl Compatible Regular Expressions (PCRE). We use the rules from the Snort rule set as our test rules. Snort rules defines both the 5-tuple headers to match incoming packets and the pcre patterns to search for when a packet matches the 5-tuple header. In our study, we focus on the pcre matching and assume that a header matching unit is available to determine which pcre patterns the packet should be matched against.
- Download project bitfile:
nf2_download regex.bit
The latest version Verilog code supports all PCRE features: anchor mark, dollar mark, multiple lines mode, word boundary, back reference and look ahead.
sendUpdate.c : C source code of sending packets containing DFA image data.
rawEthernet.c : C source code of sending packets containing input stream to test the matching engine.
PCRE Rule: \x2FAD\x2FU?CMD/smi
Input stream: AD/AD/CMD /AD/UCMD /AD/UUCMD Testing results: Match Match No Match
PCRE Rule: ^[0-2]{1,3}\x00
Input stream: 00^@ 111^@ 22^@ ^@00^@ Testing results: Match Match Match No Match
Comments: The symbol "^@" in the input stream represents the character of 0x00.
C. Hayes and Y. Luo. Dpico: A high speed deep packet inspection engine using compact finite automata. In ACM Symposium on Architecture for Network and Communication Systems, Orlando, FL, December 2007.