1. Technical Field
The present invention relates to database management in computer networks in general and, in particular, to managing said database in a manner that simplifies the database or reduces its size.
2. Prior Art
Broadly, a computer network may be viewed as a plurality of nodes interconnected by communications subsystems. The communications subsystems may include transmission links (such as a T1 line), local area networks (LANs), wide area networks (WANs), internets, etc. The nodes may include one or more devices such as switches, routers, bridges, network interface cards (NICs), etc. Usually, NICs are components that are mounted in higher level devices such as servers. As used in this document, a node is deemed to be synonymous with one of these devices.
A switch is a network node that directs datagrams on the basis of Medium Access Control (MAC) addresses, that is, Layer 2 in the Open Systems Interconnection Basic Reference Model (OSI model) well known to those skilled in the art [see “The Basics Book of OSI and Network Management” by Motorola Codex from Addison-Wesley Publishing Company, Inc., 1993]. A switch can also be thought of as a multiport bridge, a bridge being a device that connects two LAN segments together and forwards packets on the basis of Layer 2 data. A router is a network node that directs datagrams on the basis of finding the longest prefix in a routing table of prefixes that matches the Internet Protocol (IP) destination address of a datagram, all within Layer 3 in the OSI model. A Network Interface Card (NIC) is a device that interfaces a network such as the Internet with an edge resource such as a server, cluster of servers, or server farm. A NIC might classify traffic in both directions for the purpose of fulfilling Service Level Agreements (SLAs) regarding Quality of Service (QoS). A NIC may also switch or route traffic in response to classification results and current congestion conditions. The present invention applies to a network node that can be a switch, a router, a NIC, or, more generally, a machine capable of classifying packets and taking an action or actions (such as discarding the packet) based upon classification results.
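By way of illustration only, the longest prefix matching performed by a router can be sketched in Python as follows. The routing table contents, the port names, and the function name longest_prefix_match are hypothetical and chosen for this example; an actual router would use a specialized search structure rather than the linear scan shown here.

```python
import ipaddress

# Hypothetical routing table: prefix -> next hop (names are illustrative only).
ROUTES = {
    "0.0.0.0/0": "port0",    # default route, matches every address
    "10.0.0.0/8": "port1",
    "10.1.0.0/16": "port2",
}

def longest_prefix_match(dest, routes):
    """Return the next hop of the longest prefix containing dest, or None."""
    addr = ipaddress.ip_address(dest)
    best_net, best_hop = None, None
    for prefix, hop in routes.items():
        net = ipaddress.ip_network(prefix)
        # Keep the matching prefix with the greatest prefix length.
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, hop
    return best_hop
```

In this sketch the address 10.1.2.3 is contained in all three prefixes, and the 16-bit prefix is selected because it is the longest.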
A necessary component of the node is the database which is generated by a network administrator. The database may be used for a variety of purposes including filtering or network processing.
Network processing in general entails examining packets relative to the database and deciding what to do with them. Usually the action to be taken is part of or is recorded in the database. This examination can be costly in terms of processing cycles, and traffic can arrive irregularly over time. Consequently, to avoid backlogs, queuing latency and the danger of buffer overflow, network nodes in general must attempt to enforce security policies or other policies based upon classification as efficiently as possible.
The database is usually arranged as a matrix including a plurality of rows and a plurality of columns. Each row represents a rule in the database. The characters in the database matrix can be 0, 1 and * (don't care or wildcard). Because the database is made out of only three character types, it is often referred to as a Ternary data structure. When the Ternary data structure is loaded in a Content Addressable Memory (CAM), the combination (i.e. CAM and database) is referred to as a Ternary Content Addressable Memory (TCAM).
Information such as a computer network packet can be assigned a key. Typically a key is a fixed-length binary expression that is the concatenation of bits from the standard header fields of the packet. A Ternary Content Addressable Memory (TCAM) includes rows that represent classifications or rules. The rows appear in an array (a matrix, in the present invention). Each row of the array includes logical tests matching bits in a key with 0, 1, and * (don't care or wildcard) entries. For example, the key 0110 would fit the rule 01** since the first two bits of the key match the 0 and 1 in the rule and the remaining two positions are wildcards; of course, typical keys and rules would have many more than four bit positions. The length of a row is the total number of entries in it and is constant (typically about 100 bit positions) for all rows. It is the number of columns in the array seen as a matrix. Each row points to an action (or possibly a combination of actions) and a priority (to be used if one key can match multiple rows). An input key for a packet is derived from (perhaps equal to) a packet header field or the concatenation of packet header fields, and has the same length as the TCAM row length. The key represents the packet and is fed to the TCAM. The key is tested simultaneously for a match with the corresponding 0, 1, and * entries of every row. If no rows fit, then a default action is taken (or an all * row is included with lowest priority). Otherwise, of all the rows that do fit, the one with highest priority is selected and its action is enforced.
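The matching behavior described above can be illustrated by a minimal software sketch in Python. The names Rule, fits and classify, the action labels, and the convention that a larger priority number wins are assumptions made for this example. A hardware TCAM tests all rows in parallel, whereas this sketch tests them one after another.

```python
from typing import List, NamedTuple

class Rule(NamedTuple):
    pattern: str   # one character per column, each '0', '1' or '*'
    action: str    # hypothetical action label
    priority: int  # larger value wins (an assumption of this sketch)

def fits(key: str, pattern: str) -> bool:
    """True if every non-wildcard entry of the pattern equals the key bit."""
    if len(key) != len(pattern):
        raise ValueError("key and rule must have the same length")
    return all(p == "*" or p == k for k, p in zip(key, pattern))

def classify(key: str, rules: List[Rule], default: str = "default") -> str:
    """Return the action of the highest-priority rule that fits the key."""
    matches = [r for r in rules if fits(key, r.pattern)]
    if not matches:
        return default  # no row fits, so the default action is taken
    return max(matches, key=lambda r: r.priority).action

RULES = [
    Rule("01**", "forward", 2),
    Rule("0110", "discard", 3),
    Rule("1***", "log", 1),
]
```

With these illustrative rules, the key 0110 fits both 01** and 0110, and the row with the higher priority determines the action; the key 0000 fits no row and receives the default action.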
A 0, 1, * (Ternary) array logically identical to that searched by a TCAM can also be searched by numerous tree search methods. In tree search technology, a few bit positions are tested and, depending upon the location and relative frequency of 0, 1 entries versus * entries, the bit tests can eliminate from consideration all but one or a few rules or rows. That is, the bit tests can be used to show that the majority of rules cannot possibly fit a certain key, leaving a relatively simple test of the full key by one remaining rule or a few remaining rules. U.S. Pat. No. 6,298,340 “System and method and computer program for filtering using tree structure” describes one such approach. An alternate approach, called the Balanced Routing Tables (BaRT) Algorithm, is described in U.S. patent application publication: US 2002/0002549 A1, Jan. 3, 2002. Other approaches are also set forth in J. van Lunteren, “Searching very large routing tables in wide embedded memory”, Proceedings IEEE Globecom, vol. 3, pp. 1615-1619, November 2001 and J. van Lunteren, “Searching Very Large Routing Tables In Fast SRAM,” IEEE International Conference on Computer Communications and Networks ICCCN 2001, Phoenix, Ariz., Oct. 15-17, 2001. The cited references are included here as if in full.
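The general idea of eliminating rows by testing a few bit positions can be sketched in Python as follows. This is a simplified illustration only and is not the method of any of the cited references; the function names build_index and lookup, the choice of tested positions, and the example rules are assumptions made for this sketch.

```python
from itertools import product
from typing import Dict, List

def fits(key: str, pattern: str) -> bool:
    """True if every non-wildcard entry of the pattern equals the key bit."""
    return len(key) == len(pattern) and all(
        p == "*" or p == k for k, p in zip(key, pattern)
    )

def build_index(rules: List[str], positions: List[int]) -> Dict[str, List[int]]:
    """For each combination of tested bits, list the rows that could still fit."""
    index = {}
    for combo in product("01", repeat=len(positions)):
        bits = "".join(combo)
        # A row survives if each tested entry is '*' or equals the tested bit.
        index[bits] = [
            i for i, rule in enumerate(rules)
            if all(rule[p] in ("*", b) for p, b in zip(positions, bits))
        ]
    return index

def lookup(key: str, rules: List[str], positions: List[int],
           index: Dict[str, List[int]]) -> List[int]:
    """Test a few bits first, then test the full key on the surviving rows."""
    bits = "".join(key[p] for p in positions)
    return [i for i in index[bits] if fits(key, rules[i])]

RULES = ["00*1", "01**", "10*0", "11**", "**11"]
POSITIONS = [0, 1]                       # bit positions tested first
INDEX = build_index(RULES, POSITIONS)
```

In this sketch, testing the first two bits of the key 0110 leaves only rows 1 and 4 as candidates, so the full key is compared against two rows instead of five. A row with wildcards in the tested positions, such as **11, survives every bit test, which is why the location and frequency of * entries affect how many rows can be eliminated.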
Given an array of 0, 1, * entries and a key, a TCAM has the advantage of testing the key with all rules simultaneously and discovering all matches in only one processor cycle. However, the same key and array can be tested by tree approaches that can require smaller and cheaper hardware resources, perhaps one hundred times fewer transistors, to discover matches in tens of processor cycles. The optimal approach, be it TCAM, tree, or other, to finding which 0, 1, * rows of a ternary array fit a given key depends upon performance requirements.
One of the factors influencing performance is the size (number of rows and columns) of the ternary array. Any reduction in the number of rows and/or the number of columns has a positive effect on performance in that less storage is required and the search can be done in a much shorter time interval. Even though reducing the size of the ternary array is a desirable goal, the prior art has not provided an apparatus and/or method (tool) that analyzes a ternary array and provides an array that is logically equivalent to, but smaller than, the original array.
In view of the above, there is a need for such a tool, which is provided by the present invention.
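What it means for a smaller array to be logically equivalent to the original can be illustrated by a brute-force check in Python over a small key width. This sketch is offered only to make the notion concrete and is not the tool of the present invention; the function names, the action labels, and the convention that earlier rows have higher priority are assumptions made for the example.

```python
from itertools import product
from typing import List, Tuple

RuleList = List[Tuple[str, str]]  # (pattern, action), earlier rows win

def fits(key: str, pattern: str) -> bool:
    """True if every non-wildcard entry of the pattern equals the key bit."""
    return len(key) == len(pattern) and all(
        p == "*" or p == k for k, p in zip(key, pattern)
    )

def first_match_action(key: str, rules: RuleList, default: str = "default") -> str:
    """Action of the first (highest-priority) row that fits, else the default."""
    for pattern, action in rules:
        if fits(key, pattern):
            return action
    return default

def equivalent(a: RuleList, b: RuleList, width: int) -> bool:
    """True if both arrays assign the same action to every possible key."""
    return all(
        first_match_action("".join(bits), a) == first_match_action("".join(bits), b)
        for bits in product("01", repeat=width)
    )

# Two rows that differ in one position and share an action collapse into one.
ORIGINAL = [("010*", "permit"), ("011*", "permit"), ("1***", "deny")]
REDUCED = [("01**", "permit"), ("1***", "deny")]
```

Here the rows 010* and 011* differ only in the third position and point to the same action, so the single row 01** yields the same action for every four-bit key. Exhaustive checking of this kind is practical only for very short keys, since the number of keys doubles with each added bit position.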