Field
The present invention relates to a multi-pattern matching algorithm and a processing apparatus using the same, and more particularly to a multi-pattern matching algorithm using a direct filter and a compact table and a processing apparatus using the same.
Description of Related Art
Multi-pattern matching concerns determining whether at least one of a plurality of patterns exists in a string. In the past, in order to solve a multi-pattern matching problem, the existence of each pattern was checked, as shown in FIG. 12, by searching the string once for each pattern. However, in this method, the string must be searched as many times as there are patterns to be checked, so that the method becomes slower as the number of patterns increases.
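By way of a non-limiting illustration, the conventional one-search-per-pattern approach described above may be sketched as follows. The function name `naive_multi_match` and the sample patterns are assumptions introduced only for this illustration and do not appear in the drawings.

```python
def naive_multi_match(text, patterns):
    """Report, for each pattern, whether it occurs in text.

    The text is scanned once per pattern, so the total work grows
    with the number of patterns.
    """
    found = {}
    for pattern in patterns:
        # str.find scans the text from its start for this one pattern.
        found[pattern] = text.find(pattern) != -1
    return found
```

For example, `naive_multi_match("ushers", ["he", "she", "his", "hers"])` scans the string four times, once for each pattern, and reports that every pattern except "his" is present.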
Therefore, for the purpose of overcoming such a problem, research has been conducted on multi-pattern matching algorithms capable of checking whether all of the patterns exist or not by a single search of the string, regardless of the number of the patterns.
In general, solving the multi-pattern matching problem by means of a single-pattern matching algorithm has a time complexity of O(m+zn) (here, m: the sum of the lengths of all of the patterns, z: the number of the patterns, n: the length of the string). In contrast, the Aho-Corasick algorithm, which is one of the conventional algorithms, has a time complexity of O(m+n+k) (k: the number of occurrences of the patterns in the string).
Referring to FIG. 13, the Aho-Corasick algorithm uses a structure in which a failure link and an output link are added to a keyword tree including the patterns. By using this structure, the Aho-Corasick algorithm is able to determine whether all of the patterns in the keyword tree exist or not by a single search of the string.
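By way of a non-limiting illustration, a keyword tree with failure links may be sketched as follows. This is a minimal sketch under stated assumptions and is not the structure of FIG. 13 itself: the names `build_automaton` and `search` are hypothetical, and, for brevity, the output link is realized by copying the patterns of the failure state into each state's output list rather than by storing a separate link.

```python
from collections import deque


def build_automaton(patterns):
    """Build a keyword tree with failure links and per-state outputs."""
    goto = [{}]    # goto[state][char] -> child state (keyword tree edges)
    fail = [0]     # fail[state] -> state of the longest proper suffix
    output = [[]]  # output[state] -> patterns recognized at this state

    # Step 1: insert every pattern into the keyword tree.
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(pattern)

    # Step 2: set failure links in breadth-first order.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, child in goto[state].items():
            queue.append(child)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Assumed simplification: merge outputs instead of linking.
            output[child] = output[child] + output[fail[child]]
    return goto, fail, output


def search(text, patterns):
    """Return (start_index, pattern) for every match, in one pass."""
    goto, fail, output = build_automaton(patterns)
    state = 0
    matches = []
    for i, ch in enumerate(text):
        # Follow failure links until ch can be consumed or root is reached.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in output[state]:
            matches.append((i - len(pattern) + 1, pattern))
    return matches
```

For example, `search("ushers", ["he", "she", "his", "hers"])` reads the string once and returns `[(1, 'she'), (2, 'he'), (2, 'hers')]`. Each state is held as a separate dictionary, so successive transitions may touch widely separated memory locations as the number of patterns grows.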
However, the Aho-Corasick algorithm has a problem in that the size of the tree used for searching increases rapidly as the number of the patterns increases. Therefore, due to the features of the tree structure, many cache misses occur during a search using the Aho-Corasick algorithm. In general, a large number of cache misses leads directly to performance degradation.
Accordingly, there is a need for research on a multi-pattern matching algorithm capable of reducing the occurrence of cache misses and on a processing apparatus using the same.