1. Field of the Invention
The present invention relates generally to computer networks, and more specifically, to a method and apparatus for configuring an associative memory device to efficiently perform matches against long input strings, such as network messages.
2. Background Information
A computer network typically comprises a plurality of interconnected entities that transmit (i.e., “source”) or receive (i.e., “sink”) data frames. A common type of computer network is a local area network (“LAN”), which typically refers to a privately owned network within a single building or campus. LANs employ a data communication protocol (LAN standard), such as Ethernet, FDDI or Token Ring, that defines the functions performed by the data link and physical layers of a communications architecture (i.e., a protocol stack), such as the Open Systems Interconnection (OSI) Reference Model. In many instances, multiple LANs may be interconnected by network links to form a wide area network (“WAN”), metropolitan area network (“MAN”) or intranet. These LANs and/or WANs, moreover, may be coupled through one or more gateways to the well-known Internet.
Each network entity preferably includes network communication software, which may operate in accordance with the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of communication protocols. TCP/IP basically consists of a set of rules defining how entities interact with each other. In particular, TCP/IP defines a series of communication layers, including a transport layer and a network layer. At the transport layer, TCP/IP includes both the User Datagram Protocol (UDP), which is a connectionless transport protocol, and TCP, which is a reliable, connection-oriented transport protocol. When a process at one network entity wishes to communicate with another entity, it formulates one or more network messages and passes them to the upper layer of the TCP/IP communication stack. These messages are passed down through each layer of the stack, where they are encapsulated into segments, packets and frames. Each layer also adds information in the form of a header to the messages. The frames are then transmitted over the network links as bits. At the destination entity, the bits are reassembled and passed up the layers of the destination entity's communication stack. At each layer, the corresponding message headers are stripped off, thereby recovering the original network message, which is handed to the receiving process.
One or more intermediate network devices are often used to couple LANs together and allow the corresponding entities to exchange information. For example, a bridge may be used to provide a “bridging” function between two or more LANs. Alternatively, a switch may be utilized to provide a “switching” function for transferring information, such as data frames or packets, among entities of a computer network. Typically, the switch is a computer having a plurality of ports that couple the switch to several LANs and to other switches. The switching function includes receiving network messages at a source port and transferring them to at least one destination port for receipt by another entity. Switches may operate at various levels of the communication stack. For example, a switch may operate at layer 2, which, in the OSI Reference Model, is called the data link layer and includes both the Logical Link Control (LLC) and Media Access Control (MAC) sub-layers.
Other intermediate devices, commonly referred to as routers, may operate at higher communication layers, such as layer 3, which in TCP/IP networks corresponds to the Internet Protocol (IP) layer. IP message packets include a corresponding header which contains an IP source address and an IP destination address. Routers or layer 3 switches may re-assemble or convert received data frames from one LAN standard (e.g., Ethernet) to another (e.g., Token Ring). Thus, layer 3 devices are often used to interconnect dissimilar subnetworks. Some layer 3 devices may also examine the transport layer headers of received messages to identify the corresponding TCP or UDP port numbers being utilized by the corresponding network entities. Such extended-capability devices are often referred to as Layer 4, Layer 5, Layer 6 or Layer 7 switches or as Network Appliances.
Access Control Lists
Some networking software, including the Internetwork Operating System (IOS®) from Cisco Systems, Inc. of San Jose, Calif., supports the creation of access control lists or filters. These access control lists are typically used to prevent certain traffic from entering or exiting a network. In particular, a layer 3 device may utilize an access control list to decide whether a received message should be forwarded or filtered (i.e., dropped) based on certain predefined criteria. The criteria may be IP source address, IP destination address, or upper-layer application based on TCP/UDP port numbers. Many applications are assigned specific, fixed TCP and/or UDP port numbers in accordance with Request for Comments (RFC) 1700. For example, TCP/UDP port number 80 corresponds to the Hypertext Transfer Protocol (HTTP), while port number 21 corresponds to the File Transfer Protocol (FTP) service. An access control list may thus allow e-mail to be forwarded, but cause all Telnet traffic to be dropped. Access control lists may be established for both inbound and outbound traffic and are most commonly configured at border devices (i.e., gateways or firewalls).
To generate an access control list, a network administrator typically defines a sequence of statements using a conventional text editor or graphical user interface (GUI). The statements typically recite some criteria of interest, e.g., IP addresses, port numbers, etc. As each subsequent statement is defined, it is appended to the end of the list. The completed list is then downloaded to the desired layer 3 device where it may be stored in the device's non-volatile RAM (NVRAM) typically as a linked list. Upon initialization, the device copies the access control list to its dynamic memory. When a packet is subsequently received at a given interface of the device, a software module of IOS® tests the received packet against each criteria statement in the list. That is, the statements are checked in the order presented by the list. Once a match is found, the corresponding decision or action (e.g., permit or deny) is returned and applied to the packet. In other words, following the first match, no more criteria statements are checked. Accordingly, at the end of each access control list a “deny all traffic” statement is often added. Thus, if a given packet does not match any of the criteria statements, the packet will be discarded.
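The first-match evaluation described above can be illustrated with a minimal software sketch. It is not the IOS® implementation; the `Rule` class, the `evaluate` function, and the sample addresses are hypothetical names and values chosen for illustration, and only three of the possible criteria are modeled.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One criteria statement (hypothetical model); None matches anything."""
    action: str                      # "permit" or "deny"
    src: Optional[str] = None        # source prefix, e.g. "10.0.0.0/8"
    dst: Optional[str] = None        # destination prefix
    dst_port: Optional[int] = None   # TCP/UDP destination port

    def matches(self, src: str, dst: str, dst_port: int) -> bool:
        if self.src is not None and ip_address(src) not in ip_network(self.src):
            return False
        if self.dst is not None and ip_address(dst) not in ip_network(self.dst):
            return False
        if self.dst_port is not None and dst_port != self.dst_port:
            return False
        return True


def evaluate(acl, src, dst, dst_port):
    """Check statements in list order; return the action of the first match."""
    for rule in acl:
        if rule.matches(src, dst, dst_port):
            return rule.action       # no further statements are checked
    return "deny"                    # models the trailing "deny all traffic"


# Statements are appended in the order the administrator defines them.
acl = [
    Rule("deny", dst_port=23),                       # drop all Telnet traffic
    Rule("permit", dst_port=25),                     # allow e-mail (SMTP)
    Rule("permit", src="10.0.0.0/8", dst_port=80),   # HTTP from one prefix only
]
```

Because each packet may be compared against every statement before a match is found, the cost of this kind of software evaluation grows with the length of the list and with the number of lists applied per packet.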
Most intermediate network devices employ either centralized or distributed classification engines. With a centralized classification engine, both the processor executing the program instructions and the memory storing the actions are typically located on a single supervisor card disposed within the network device. All network messages received by the network device are sent to the supervisor card for processing by the classification engine. With a distributed architecture, the classification engine is replicated across a plurality of the device's line cards. For example, each line card, which has a plurality of ports, has its own classification engine for processing the network messages received and/or to be forwarded from those ports. The centralized architecture minimizes resources by requiring only a single store of the ACLs and the actions to be applied to the network messages. A centralized architecture, however, may produce a bottleneck that reduces the device's performance, as it must process the network messages from all of the device's ports. The distributed architecture generally improves performance because the classification process is spread across a plurality of engines. However, the distributed architecture requires a replication of components across multiple line cards, thereby increasing the cost of the device.
As indicated above, access control lists are used primarily to provide security. Thus, for a given interface, only a single list is evaluated per direction. The lists, moreover, are relatively short. Nevertheless, the evaluation of such lists by software modules can significantly degrade the intermediate device's performance (e.g., number of packets processed per second). This degradation in performance has been accepted mainly due to a lack of acceptable alternatives. It is proposed, however, to expand the use of access control lists for additional features besides just security decisions. For example, access control lists may also be used to determine whether a given packet should be encrypted and/or whether a particular quality of service (QoS) treatment should be applied. Accordingly, it is anticipated that multiple access control lists may be assigned to a single interface. As additional access control lists are defined and evaluated per packet, the reduction in performance will likely reach unacceptable levels.
To improve performance, some devices store access control lists in an associative memory, such as a ternary content addressable memory (TCAM). Many TCAM suppliers currently make TCAMs up to 144 bits in width. This has proven acceptable because the total number of bits being evaluated is on the order of 133. In particular, the message fields currently being evaluated by access control lists (i.e., the criteria) include IP source address, IP destination address, protocol, TCP/UDP source port, TCP/UDP destination port, virtual local area network (VLAN) identifier, differentiated services codepoint (DSCP), and the physical port on which the message was received. With version 4 of the Internet Protocol (IPv4), source and destination addresses are 32 bits in length. Accordingly, the above information, typically referred to as the flow label, adds up to approximately 133 bits, which is less than the width of many commercially available TCAMs.
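By way of illustration only, the ternary matching performed by a TCAM can be modeled in software as follows. Each entry is assumed to hold a value and a mask, where a mask bit of 1 means the bit must match and a mask bit of 0 means "don't care". The function names and the 8-bit sample entries are hypothetical, and the loop stands in for a comparison that a hardware TCAM performs against all entries in parallel.

```python
def ternary_match(key: int, value: int, mask: int) -> bool:
    """True when key equals value on every bit that mask marks as 'care' (1)."""
    return ((key ^ value) & mask) == 0


def tcam_lookup(entries, key):
    """Return the index of the first matching entry, or None if none match.

    A hardware TCAM compares all entries at once and reports the
    highest-priority hit; this loop is only a sequential model of that.
    """
    for index, (value, mask) in enumerate(entries):
        if ternary_match(key, value, mask):
            return index
    return None


# Hypothetical 8-bit entries, listed in priority order as (value, mask).
entries = [
    (0b1010_0000, 0b1111_0000),   # matches 1010xxxx
    (0b0000_0001, 0b0000_0001),   # matches xxxxxxx1
    (0b0000_0000, 0b0000_0000),   # all bits don't care: matches any key
]
```

Under this model, keeping the entries in the same order as the statements of an access control list preserves the first-match behavior of the list, while the width of `value` and `mask` corresponds to the width of the TCAM.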
With version 6 of the Internet Protocol (IPv6), however, network layer addresses are now 128 bits long. Assuming the same fields are to be evaluated, the flow labels being evaluated are now approximately 336 bits long, which is more than twice the size of many current TCAMs. It is also proposed to evaluate higher-level messages, e.g., up to layer 7, which is the application layer. This would further increase the amount of information, and thus the number of bits, being evaluated.
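The width figures above can be checked with a short calculation. The sketch below assumes the standard header field sizes (8-bit protocol, 16-bit ports, 12-bit VLAN identifier, 6-bit DSCP). The width of the physical port identifier is not stated here, so it is left out of the sum and appears only as a remainder against the approximate totals of 133 and 336 bits. The helper names are hypothetical.

```python
import math

TCAM_WIDTH = 144  # bits, the TCAM width cited in the text

# Field widths in bits, excluding the two network layer addresses.
# These are standard header field sizes, assumed here for illustration;
# the physical-port width is not given in the text and is omitted.
COMMON_FIELDS = {
    "protocol": 8,
    "src_port": 16,
    "dst_port": 16,
    "vlan_id": 12,
    "dscp": 6,
}


def known_bits(address_bits: int) -> int:
    """Sum of source address, destination address and the common fields."""
    return 2 * address_bits + sum(COMMON_FIELDS.values())


def entries_needed(label_bits: int, width: int = TCAM_WIDTH) -> int:
    """Number of width-bit TCAM words needed to hold one flow label."""
    return math.ceil(label_bits / width)


ipv4_known = known_bits(32)     # itemized IPv4 fields
ipv6_known = known_bits(128)    # itemized IPv6 fields

ipv4_words = entries_needed(133)   # flow label size cited for IPv4
ipv6_words = entries_needed(336)   # flow label size cited for IPv6
```

With these assumptions the itemized fields account for 122 bits under IPv4 and 314 bits under IPv6, and a flow label of approximately 336 bits would span three 144-bit words rather than one.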
In addition, TCAMs often require more power, are more expensive and are often slower than static Random Access Memory (SRAM). Accordingly, as the speed of network links increases and intermediate devices are called upon to perform more and more processing of each packet, designers look to multiple SRAMs to perform packet classification operations. Multiple SRAM approaches, however, consume large amounts of printed circuit board space and increase the pin count, driving up the cost and complexity of the designs. In order to process IPv4 addresses, for example, a design might employ six SRAMs. For IPv6 addresses, twenty or more SRAMs might be required. Such a large number of SRAM components would likely result in unacceptably high power requirements. It may also decrease the mean time between failures of the intermediate device, and place high thermal stresses on the device's mechanical design.
Accordingly, a need exists for a mechanism that can search long strings of data (e.g., 366 bits or more) at relatively high speed and can do so in an efficient manner.