Pattern matching decides whether a given character string matches one of a desired set of character strings. Such matching may utilize a regular expression, which represents a set of character strings by use of normal characters and/or metacharacters. Determining whether the given character string matches a regular-expression-based character string (i.e., a regular expression pattern) makes it possible to check whether the given character string matches one of the desired set of character strings.
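As a minimal software illustration of this idea, the following sketch uses Python's standard `re` module. The pattern and the helper name `matches` are hypothetical and chosen only for illustration; they do not appear in the method described here.

```python
import re

# Hypothetical pattern: one or more of 'A'/'B', optionally followed by one digit.
# It stands for the whole set {"A", "B", "AB", "A7", "ABBA7", ...}.
PATTERN = re.compile(r"[AB]+[0-9]?")


def matches(text: str) -> bool:
    """Return True if the entire string belongs to the set the pattern represents."""
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["A", "ABBA7", "7", "AC"]:
        print(candidate, matches(candidate))
```

Here `fullmatch` is used so that the whole string, not just a substring, has to belong to the set.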
A method for performing regular-expression matching on a circuit by setting a regular expression pattern in a RAM is known in the art. In this method, a single pattern circuit is used to represent part of a regular expression pattern. A plurality of pattern circuits are series-connected to form a pattern circuit line that is capable of performing complex regular-expression matching. For example, the following regular-expression pattern may be used for matching.
[AB]+.{1,3}[BC]?.*[^0]  (1)
In this case, a first pattern circuit is assigned to “[AB]+”, second through fourth pattern circuits are assigned to “.{1,3}”, a fifth pattern circuit is assigned to “[BC]?”, a sixth pattern circuit is assigned to “.*”, and a seventh pattern circuit is assigned to “[^0]”. The first through seventh pattern circuits are connected in series to form a pattern circuit line. The characters of a character string to be matched are successively input into the pattern circuit line at the first-pattern-circuit end of the line. In each cycle, each pattern circuit matches the character supplied thereto against the portion of the regular expression pattern assigned thereto. The first pattern circuit matches the character currently supplied thereto against its part of the regular expression pattern, and then sends this character and the result of the matching to the pattern circuit situated at the next stage. Each of the second and subsequent pattern circuits matches the character currently supplied thereto against its part of the regular expression pattern, and generates, based on the result of the current matching and the result of matching supplied from the preceding stage, a collective result of matching for the first stage through the stage of the given circuit. The given circuit then sends this character and the collective result of matching to the pattern circuit situated at the next stage.
The collective result of matching is set equal to a value indicative of a match upon the simultaneous occurrences of the condition that the result of matching supplied from the preceding stage indicates a match and the condition that the result of the current matching indicates a match. With this arrangement, the collective result of matching produced by the last-stage, seventh pattern circuit indicates a match in a certain cycle when a character string matching the regular expression pattern shown in the above-noted expression (1) is supplied as an input.
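The behavior of such a pattern circuit line can be approximated in software. The sketch below is not the circuit itself. It assumes that each stage carries two flags, `repeat` (for “+” and “*”) and `skip` (for “?” and “*”), and that matching is anchored at the first input character. These flags, the names `Stage`, `run_line`, and `cross_check`, and the anchoring are assumptions made for illustration, since the text above does not state how the circuits handle quantifiers. Each stage combines its own character match with the result arriving from the preceding stage, and the last stage's value is reported once per cycle.

```python
import re
from dataclasses import dataclass
from itertools import product
from typing import Callable, List


@dataclass(frozen=True)
class Stage:
    """Software stand-in for one pattern circuit (flags are assumptions)."""
    accepts: Callable[[str], bool]  # character class assigned to this stage
    repeat: bool = False            # may consume more than one character (+, *)
    skip: bool = False              # may consume no character at all (?, *)


def any_char(_: str) -> bool:
    return True


# Expression (1) spread over seven stages, following the assignment in the text.
STAGES: List[Stage] = [
    Stage(lambda c: c in "AB", repeat=True),  # [AB]+
    Stage(any_char),                          # .{1,3}: first, mandatory
    Stage(any_char, skip=True),               # .{1,3}: second, optional
    Stage(any_char, skip=True),               # .{1,3}: third, optional
    Stage(lambda c: c in "BC", skip=True),    # [BC]?
    Stage(any_char, repeat=True, skip=True),  # .*
    Stage(lambda c: c != "0"),                # [^0]
]


def run_line(stages: List[Stage], text: str) -> List[bool]:
    """Return the last stage's collective result for every cycle (input character)."""
    prev = [False] * len(stages)
    at_start = True  # anchored: only the first character may start a match
    outputs = []
    for ch in text:
        cur = []
        carry = at_start  # result of matching handed over by the preceding stages
        for stage, was_active in zip(stages, prev):
            stay = was_active and stage.repeat
            cur.append(stage.accepts(ch) and (carry or stay))
            # The next stage may start if this stage matched in the previous
            # cycle, or if this stage can be skipped entirely.
            carry = was_active or (carry and stage.skip)
        outputs.append(cur[-1])  # valid because the last stage is mandatory
        prev, at_start = cur, False
    return outputs


def cross_check(max_len: int = 5, alphabet: str = "ABC0") -> bool:
    """Compare the sketch with Python's re engine on all short strings."""
    reference = re.compile(r"[AB]+.{1,3}[BC]?.*[^0]", re.DOTALL)
    for n in range(1, max_len + 1):
        for chars in product(alphabet, repeat=n):
            text = "".join(chars)
            expected = [reference.fullmatch(text[:i + 1]) is not None
                        for i in range(n)]
            if run_line(STAGES, text) != expected:
                return False
    return True


if __name__ == "__main__":
    print(run_line(STAGES, "AAB"))  # [False, False, True]
    print(cross_check())
```

For the input “AAB”, the last stage indicates a match in the third cycle, because “A” satisfies “[AB]+”, the second “A” satisfies “.”, and “B” satisfies “[^0]” while the optional stages are skipped.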
Pattern circuit lines may be provided in parallel to perform parallel processing, thereby simultaneously matching a plurality of data streams against different regular expression patterns, respectively. This arrangement can improve the speed of matching. For this kind of parallel processing, different types of pattern circuit lines as defined by respective, different numbers of series-connected pattern circuits are provided, and, also, a plurality of pattern circuit lines are provided for each type. For example, 6 size-“4” pattern circuit lines each comprised of 4 series-connected pattern circuits, 2 size-“8” pattern circuit lines each comprised of 8 series-connected pattern circuits, and one size-“12” pattern circuit line comprised of 12 series-connected pattern circuits may be provided. A matching core serves to write a regular expression pattern to a pattern circuit line and also to supply a character string to be matched to the pattern circuit line. A plurality of matching cores are provided for a plurality of data streams, respectively. One or more pattern circuit lines are then connected to one matching core. One pattern circuit line may be connected to only one matching core to perform matching in a dedicated fashion, or may be connected to a plurality of matching cores to perform matching in a shared manner. A pattern circuit line that is shared by a plurality of matching cores is subjected to exclusive control, such that the pattern circuit line performs matching for only one matching core at any given time.
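The exclusive control of shared pattern circuit lines can be pictured with a small software model. The sketch below is only an illustration: the class names `PatternLine` and `SharedLinePool`, the integer core identifiers, and the policy of handing out the smallest free line that fits are all assumptions and are not taken from the arrangement described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PatternLine:
    size: int                    # number of series-connected pattern circuits
    owner: Optional[int] = None  # matching core currently using the line, if any


class SharedLinePool:
    """Pattern circuit lines shared by several matching cores (hypothetical model)."""

    def __init__(self, sizes: List[int]) -> None:
        # Sorted by size so that the first fitting free line is the smallest one.
        self.lines = [PatternLine(size) for size in sorted(sizes)]

    def acquire(self, core_id: int, pattern_size: int) -> Optional[PatternLine]:
        """Give one free line of sufficient size to a core, or None if none is free."""
        for line in self.lines:
            if line.owner is None and line.size >= pattern_size:
                line.owner = core_id  # exclusive: one matching core at a time
                return line
        return None

    def release(self, core_id: int, line: PatternLine) -> None:
        """Return a line to the pool; only the current owner may release it."""
        if line.owner != core_id:
            raise ValueError("line is not held by this matching core")
        line.owner = None


if __name__ == "__main__":
    # The example from the text: six size-4, two size-8, and one size-12 line.
    pool = SharedLinePool([4] * 6 + [8] * 2 + [12])
    first = pool.acquire(core_id=0, pattern_size=10)
    second = pool.acquire(core_id=1, pattern_size=10)
    print(first.size, second)  # the only size-12 line is taken, so core 1 gets None
```

In this model a second matching core that needs the only size-“12” line has to wait until the first core releases it, which illustrates how sharing can limit the degree of parallelism.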
The parallel configuration described above may be designed such that each matching core exclusively uses one or more pattern circuit lines. In such a case, a given matching core is provided with dedicated pattern circuit lines of different sizes in order to perform matching against various regular expression patterns of different lengths. This configuration suffers from circuit redundancy, resulting in an extremely large circuit size. The circuit design in which the matching cores share one or more pattern circuit lines may have a large number of connecting wires, and may have a poor degree of parallelism.
The problem of circuit redundancy noted above exists even when there is only one matching core. The fact that a single matching core performs matching with respect to various regular expression patterns having different lengths entails that pattern circuit lines of various different sizes are provided for this matching core. For example, different regular expression patterns may be provided in a first configuration that includes 8 size-“4” patterns and one size-“8” pattern or in a second configuration that includes one size-“4” pattern and 4 size-“8” patterns. In this case, the circuit that copes with both the first configuration and the second configuration ends up having 4 size-“8” pattern circuit lines and 5 size-“4” pattern circuit lines. It may be noted that this circuit copes with the first configuration by assigning three of the size-“4” regular expression patterns to size-“8” pattern circuit lines. The number of pattern circuits in this circuit is 52 (=4×8+5×4=32+20). In the first configuration, the number of pattern circuits simply calculated by ignoring pattern circuit line sizes is 40 (=8×4+1×8). In the second configuration, the number of pattern circuits simply calculated in the same manner is 36 (=1×4+4×8). Accordingly, only 40 pattern circuits are used at the maximum. Despite this fact, the circuit that can cope with both the first configuration and the second configuration ends up having 52 pattern circuits. Such a significant circuit redundancy results in an extremely large circuit size.
[Non-Patent Document 1] Yusaku Kaneta, Shingo Yoshizawa, Shin-ichi Minato, Hiroki Arimura, Yoshikazu Miyanaga, “Dynamic Reconfigurable Bit-Parallel Architecture for Large-Scale Regular Expression Matching,” Proc. of the 2010 International Conference on Field-Programmable Technology (FPT'10), pp. 21-28, December 2010.
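The circuit counts in the single-matching-core example above can be checked with a short calculation. The sketch below assumes that a pattern may be placed on any pattern circuit line of equal or larger size and that line sizes are limited to the pattern sizes that occur. The function names `circuits` and `lines_to_cover` are hypothetical, and the provisioning rule is merely one way to reproduce the figures given in the text.

```python
from typing import Dict, Iterable

Config = Dict[int, int]  # size -> number of patterns (or lines) of that size


def circuits(config: Config) -> int:
    """Total number of pattern circuits, ignoring how patterns fit into lines."""
    return sum(size * count for size, count in config.items())


def lines_to_cover(configs: Iterable[Config]) -> Config:
    """Lines of each size such that every configuration fits on its own."""
    configs = list(configs)
    sizes = sorted({size for cfg in configs for size in cfg}, reverse=True)
    lines: Config = {}
    provided = 0  # lines already provisioned with a larger size
    for size in sizes:
        # Worst-case number of patterns that need a line of at least this size.
        demand = max(sum(n for s, n in cfg.items() if s >= size)
                     for cfg in configs)
        lines[size] = max(0, demand - provided)
        provided += lines[size]
    return lines


if __name__ == "__main__":
    first = {4: 8, 8: 1}   # 8 size-4 patterns and one size-8 pattern
    second = {4: 1, 8: 4}  # one size-4 pattern and 4 size-8 patterns
    lines = lines_to_cover([first, second])
    print(lines)                              # {8: 4, 4: 5}
    print(circuits(first), circuits(second))  # 40 36
    print(circuits(lines))                    # 52
```

The gap between 52 provisioned pattern circuits and at most 40 used pattern circuits is the redundancy described in the example.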