The Internet has provided increasingly diversified services in recent years. In addition to searching a routing table in order to forward data packets, a modern Internet switch/router also provides virtual private network functions for secure data processing and firewall functions for protecting the network. Differentiated services can also be provided by offering different levels of quality of service based on the result of packet classification. Furthermore, a layer four switch/router can direct packets to backend servers in order to achieve load balancing. All these functions rely on the result of packet classification, which is therefore a vital technique for providing the above services.
A range search is a commonly used technique in searching for data. In the packet classification for TCP/IP network protocol, it is necessary to analyze the header of a packet in order to identify which data flow the packet belongs to. In general, the 32 bit source address, 32 bit destination address, 8 bit communication protocol, 16 bit source port number and 16 bit destination port number of the internet protocol are used for searching in the rule database.
A rule database usually allows some flexibility for an administrator to set up the rules. When the administrator establishes the rules, they may include don't care or range rules. For a range rule, a packet satisfying the rule must have an associated data value between the start value and the end value of the rule. FIG. 1 shows a table for setting up rules which uses a range search for packet classification.
After the administrator sets up the rules, a rule table is constructed for the rules and their associated data. FIG. 2 illustrates a rule table 201 that comprises a database of five 8 bit rules including rule #0, rule #1, . . . and rule #4. Each rule has a data record in the database. In FIG. 2, each rule holds 8 bits of data and each bit can be ‘0’, ‘1’, or ‘X’ (don't care). In addition, a rule that appears earlier in the table is usually given a higher priority. The construction of the table depends on the search algorithm used.
With a rule table constructed, an input data value is used as an input key to search for the rule that the input data satisfies. As shown in FIG. 2, rules #1, #3 and #4 are the search results of input keys #0, #1 and #2 respectively. The rule table is used to find the data record that matches the input key. If more than one rule is satisfied by the input key, the lookup result is the rule that appears earliest in the table and hence has the highest priority.
FIG. 3 illustrates a straightforward method of a conventional range search. Assume there are n rules in the rule table. FIG. 3 shows an example in which n=8 and there are rule #0, rule #1, . . . , rule #7. The value of the input data serves as the input key. Eight identical comparator circuits in parallel receive the input key. Each rule has a corresponding comparator circuit to determine if the input key is within the data range of the rule. If the value of the input key is greater than or equal to the start value of the rule and less than or equal to the end value of the rule, the input key satisfies the rule and the output of the comparator circuit is 1. Otherwise, the output of the comparator circuit is 0. When there are multiple rules that are satisfied, a priority encoder is then used to find the highest priority rule among all the rules that are satisfied with the input key. This highest priority rule is the lookup result.
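The comparator-and-priority-encoder arrangement described above can be modeled in software as a minimal sketch. The function name and the example rule ranges below are hypothetical and are not taken from FIG. 3; the only assumptions are that each rule is an inclusive (start, end) pair and that a lower index means a higher priority.

```python
def range_search_linear(rules, key):
    """Model of the parallel-comparator range search.

    rules: list of (start, end) pairs, inclusive; index 0 has the
           highest priority (assumed ordering).
    key:   integer input key.
    Returns the index of the highest priority satisfied rule, or None.
    """
    # One "comparator" per rule: output 1 if start <= key <= end, else 0.
    match_bits = [1 if start <= key <= end else 0 for start, end in rules]

    # "Priority encoder": pick the first comparator whose output is 1.
    for index, bit in enumerate(match_bits):
        if bit:
            return index
    return None


# Hypothetical 8 bit rules, highest priority first.
example_rules = [(10, 20), (15, 30), (0, 255)]
print(range_search_linear(example_rules, 17))
print(range_search_linear(example_rules, 25))
```

In hardware the comparisons happen simultaneously; the list comprehension here only models their combined output, so the software version still costs time proportional to the number of rules.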
The straightforward method shown in FIG. 3 is equivalent to a linear search. Multiple comparator circuits are used in parallel in order to increase the speed of the linear search. The drawback of this method is that the comparator circuit has a long delay. When the number of bits in the data increases, the complexity of the comparator circuit also increases. Furthermore, when the number of rules increases, the circuit becomes too large to be implemented.
FIG. 4 illustrates another range search method. Assume there are n rules. The ranges of the n rules divide the input data space into at most 2n+1 sections. Each section has a corresponding bitmap having n bits. In the bitmap, a bit value of 1 or 0 represents whether an input value in that section satisfies or does not satisfy the corresponding rule. With reference to FIG. 4, section X5 is used as an example. If the input key value falls into section X5, both rule #1 and rule #2 are satisfied. Therefore, the bits corresponding to section X5 in the bitmap have values ‘1’, ‘1’ and ‘0’. The table composed of the bitmaps is called a rule mapping table. As shown in FIG. 4, the number of rules is 3 and the input data space is divided into seven sections X1 to X7. Each section has a corresponding bitmap with 3 bits.
The method shown in FIG. 4 sets up the rule mapping table in advance. When the table is looked up, the value of the input data is used as the input key to perform a binary search. Once the section into which the input key falls has been found, the corresponding bitmap of that section tells which rules are satisfied. The drawback of this method is that the rule mapping table is too large. If there are n rules, each bitmap has n bits. The total number of bits in the rule mapping table is n×(2n+1), which is proportional to the square of the number of rules.
When the number of rules increases, the method illustrated in FIG. 4 requires a large amount of memory space. For example, about 2M bits=256K bytes of memory are required for a rule mapping table to cover 1024 rules. In addition, an index table is required to record the boundary values of the 2n+1 sections in order for the binary search to find which section an input key value falls into. For 16 bit data with 1024 rules, the index table of this method is about 16×2×1024=32K bits=4K bytes. In terms of the search speed, the number of searches in the binary search is 1+log2(n). In the hardware, it takes two clock cycles for every read from the table. Therefore, 2×(1+log2(n)) clock cycles are required to obtain the search result. If there are 1024 rules, it takes 11 searches or 22 clock cycles.
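The following is a minimal sketch of how such a rule mapping table might be built and searched. It is an illustration under stated assumptions, not the construction used in FIG. 4: the function names, the choice of section boundaries (each rule start and each rule end plus one), and the three example ranges are all hypothetical.

```python
from bisect import bisect_right


def build_rule_mapping_table(rules, max_value):
    """Build section boundaries and one n-bit bitmap per section.

    rules: list of inclusive (start, end) ranges; index 0 is assumed
           to have the highest priority.
    Returns (boundaries, bitmaps), where boundaries[i] is the lowest
    value of section i. At most 2n+1 sections are produced.
    """
    points = {0}
    for start, end in rules:
        points.add(start)              # a section begins where a rule starts
        if end + 1 <= max_value:
            points.add(end + 1)        # and just after a rule ends
    boundaries = sorted(points)

    # Every value inside a section satisfies the same rules, so testing
    # the lowest value of the section is sufficient.
    bitmaps = [[1 if s <= low <= e else 0 for s, e in rules]
               for low in boundaries]
    return boundaries, bitmaps


def lookup(boundaries, bitmaps, key):
    """Binary search for the section containing key, then read its bitmap."""
    section = bisect_right(boundaries, key) - 1
    for index, bit in enumerate(bitmaps[section]):
        if bit:
            return index               # highest priority satisfied rule
    return None


# Hypothetical 8 bit rules; three rules give seven sections here.
example_rules = [(10, 20), (15, 30), (25, 40)]
boundaries, bitmaps = build_rule_mapping_table(example_rules, 255)
print(len(boundaries), lookup(boundaries, bitmaps, 27))
```

The sketch stores one full n-bit bitmap for each of up to 2n+1 sections, which is exactly the quadratic memory growth the text identifies as the drawback.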
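The figures quoted above can be reproduced with a short calculation. This is only a sketch of the arithmetic in the text; the function name and the dictionary keys are hypothetical, and rounding the logarithm up for rule counts that are not powers of two is an assumption the text does not state.

```python
import math


def rule_mapping_cost(n_rules, key_bits, cycles_per_read=2):
    """Memory and latency estimates for the rule mapping table method."""
    bitmap_bits = n_rules * (2 * n_rules + 1)      # n x (2n+1) bitmap bits
    index_bits = key_bits * 2 * n_rules            # about 2n boundary values
    # 1 + log2(n) searches; rounded up here as an assumption.
    searches = 1 + math.ceil(math.log2(n_rules))
    return {
        "bitmap_bits": bitmap_bits,
        "bitmap_bytes": bitmap_bits // 8,
        "index_bits": index_bits,
        "index_bytes": index_bits // 8,
        "searches": searches,
        "clock_cycles": cycles_per_read * searches,
    }


cost = rule_mapping_cost(n_rules=1024, key_bits=16)
print(cost)
```

For 1024 rules the exact bitmap size is 2,098,176 bits, which the text rounds to 2M bits.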
Conventionally, a data search can be accomplished by using content addressable memory (CAM). Take the 8×5 rule table shown in FIG. 2 as an example. The architecture using content addressable memory is shown in FIGS. 5a and 5b. As shown in FIG. 5a, the content addressable memory uses one rule register and one mask register to represent one rule in the rule table. Because the don't care bits in the rule are not compared, the value in the mask register indicates which bits of the rule have to be compared. The value in the rule register indicates whether each compared bit of the rule is ‘1’ or ‘0’. For example, rule #0 with a value 101x1x11 has a corresponding mask register ‘11101011’. If the don't care bits in the rule are set to ‘0’ and the other bits are set according to the bit values in the rule, the value of the corresponding rule register for rule #0 is ‘10101011’.
With reference to FIG. 5b, in the architecture that uses the content addressable memory for a range search, the input key value is ANDed with the values in the mask registers to extract the bits that require comparison. These bits are then compared with the values in the rule registers to determine output values that are either 1 or 0. Finally, if more than one rule is satisfied, a priority encoder is used to find the highest priority rule among all the rules that are satisfied by the input key value.
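The mask register and rule register comparison can be sketched in software as follows. The function names are hypothetical, the second pattern in the usage example is invented for illustration, and the sequential loop only models what the hardware evaluates in parallel.

```python
def encode_rule(pattern):
    """Convert a ternary pattern such as '101x1x11' into (mask, value).

    mask has a 1 for every bit that must be compared; value holds the
    required bit, with don't care positions set to 0.
    """
    mask = 0
    value = 0
    for ch in pattern:
        mask <<= 1
        value <<= 1
        if ch in "01":
            mask |= 1
            value |= int(ch)
        elif ch not in "xX":
            raise ValueError("pattern may contain only 0, 1, x or X")
    return mask, value


def cam_lookup(patterns, key):
    """Return the index of the first (highest priority) matching pattern."""
    for index, pattern in enumerate(patterns):
        mask, value = encode_rule(pattern)
        # AND the key with the mask, then compare with the rule register.
        if (key & mask) == value:
            return index
    return None


mask, value = encode_rule("101x1x11")
print(format(mask, "08b"), format(value, "08b"))

# '1xxxxxxx' is a hypothetical lower priority rule.
print(cam_lookup(["101x1x11", "1xxxxxxx"], 0b10111111))
```

The first print reproduces the mask register and rule register values given in the text for rule #0.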
Although the content addressable memory can be used to implement a fast and efficient range search, the cost of hardware is very high. Some cost may be saved because there are don't care bits in the rule table implemented by the content addressable memory. If the data range is continuous and the range value, i.e., the difference between the end value and the start value, is a sum of multiple powers of 2, the rule can be represented by a single entry in the rule table. For example, a given rule with a data range from 152 to 159 has a binary representation from 10011000 to 10011111. The range value is 2^0+2^1+2^2=7, which can be represented by a single rule table entry 10011xxx. In a different example which has a data range from 131 to 187 with a binary representation from 10000011 to 10111011, the range value is 2^3+2^4+2^5=56, which can also be represented by a single rule table entry 10xxx011.
However, if the range value is not a sum of multiple powers of 2, the rule has to be represented by multiple rule table entries. For example, a given rule with a data range from 150 to 160 is represented in binary by 10010110 to 10100000. At least three rule table entries are required, including a data range from 150 to 151 represented by 1001011x, a data range from 152 to 159 represented by 10011xxx, and a data value 160 represented by 10100000. Another rule with a data range from 130 to 187, which is represented by 10000010 to 10111011, requires at least two rule table entries including a data value 130 represented by 10000010 and a data range from 131 to 187 represented by 10xxx011. As a result, the cost of memory is even higher if the range value is not a sum of multiple powers of 2.
From the above discussion, it can be seen that the conventional methods of constructing a rule table and performing a range search have the drawbacks that the rule table is too large and the number of searches is too high. When the number of rules gets larger, many of the conventional methods become infeasible or impractical. There is a strong demand for a range search method and architecture that can save memory hardware and reduce the number of searches.
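The splitting of the range from 150 to 160 into several entries can be illustrated with a commonly used range-to-prefix expansion. This is a sketch of that general technique, not a procedure described in the text: it only produces entries whose don't care bits are trailing, so it does not generate entries of the 10xxx011 form, and the function name is hypothetical.

```python
def range_to_prefix_entries(start, end, width=8):
    """Split the inclusive range [start, end] into ternary entries
    whose don't care bits are all in the low-order positions."""
    entries = []
    while start <= end:
        # Largest power-of-two block that is aligned at start ...
        size = (start & -start) if start else (1 << width)
        # ... and that does not run past end.
        while size > end - start + 1:
            size >>= 1
        free_bits = size.bit_length() - 1          # number of trailing x bits
        fixed_bits = width - free_bits
        fixed = format(start >> free_bits, "0{}b".format(fixed_bits)) if fixed_bits else ""
        entries.append(fixed + "x" * free_bits)
        start += size
    return entries


print(range_to_prefix_entries(152, 159))
print(range_to_prefix_entries(150, 160))
```

The range from 152 to 159 fits in one aligned block of eight values and so needs one entry, while the range from 150 to 160 crosses block boundaries and expands into the three entries 1001011x, 10011xxx and 10100000 listed above.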