Modern telecommunication networks are almost exclusively digital data networks. Practically all of these digital networks make use of some form of packet or block type data format to dynamically route data packets or blocks through the network. The advantages of dynamically routing packets or blocks of data through a network are well known. Advances in fiber optic transmission, protocols and modulation schemes also make possible very high-speed data transfer rates in these modern networks. However, these high-speed data transfer rates and attendant high-speed packet rates can create communication bottlenecks at the router or switch junction between high-speed data links. One cause of communication bottlenecks and increased router latency is the finite processing power available at the network router to determine where to route or "switch" an incoming data packet.
Consider, for example, the typical packet-based communications network where each packet of data contains address fields at the beginning of each packet (e.g., the "header" of the packet, defining, inter alia, a source and/or destination address of the packet). In most promulgated standards for packet type transmission, these address fields are long binary numerals; e.g., in the IEEE 802.3 Ethernet standard each address field is 48 bits. Long addresses are necessary to accommodate potentially very large networks with many individual addressees. This type of packet data is typically found in the Network layer (layer 3) of the Open Systems Interconnect (OSI) network model. However, some OSI layer 2 protocols also have sufficient address information to facilitate high-speed routing.
One illustrative example of an appropriate application of very high-speed packet routing involves certain protocol inefficiencies that may arise when supporting a layer 3 protocol on or within a layer 2 protocol. For example, one design consideration in expanding a Local Area Network (LAN) involves the data rate trade-offs that arise from the way the Carrier Sense Multiple Access with Collision Detection (CSMA/CD) method (an OSI layer 2 protocol) resolves carrier collisions. A CSMA/CD LAN can support many users. However, as overall bandwidth demand increases, the time-out or wait state caused by carrier "collisions" can greatly increase the latency of the network. For this reason, LAN extensions often use a bridge in the data link layer (OSI layer 2) to segment a LAN and isolate high-traffic segments to prevent performance degradations across the network.
In the ideal case, a bridge should make a "real-time" routing decision (that is, decide to which port a given packet should be routed) within the packet transmission interval. The packet transmission interval is the time it takes to communicate a packet to the router or bridge. The routing decision may be accomplished by searching a station list stored in the bridge for the packet's destination address. Additionally, a search and qualified modification of the station list for the packet's source address allows the bridge to "learn" the network. For example, if a packet's source address is not in the station list, the "new" source address and the port associated with it can be added to the station list. Thus, when the author of a packet subsequently becomes the recipient of a packet, the station list will contain the appropriate entry.
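The two searches described above can be illustrated with a minimal sketch. The class and method names are hypothetical, and a Python dictionary merely stands in for whatever search mechanism the bridge actually uses; what the bridge does with an unknown destination is outside this sketch.

```python
from typing import Dict, Optional


class LearningBridge:
    """Toy model of a bridge's station list (all names are illustrative)."""

    def __init__(self) -> None:
        # Maps a station address to the port on which it was first seen.
        self.station_list: Dict[int, int] = {}

    def handle_packet(self, src: int, dst: int, in_port: int) -> Optional[int]:
        """Return the port for dst, or None if dst is not yet in the list."""
        # Source search: add the sender if it is new ("learning").
        if src not in self.station_list:
            self.station_list[src] = in_port
        # Destination search: the routing decision proper.
        return self.station_list.get(dst)


bridge = LearningBridge()
first = bridge.handle_packet(src=0xA, dst=0xB, in_port=1)   # 0xB unknown so far
reply = bridge.handle_packet(src=0xB, dst=0xA, in_port=2)   # 0xA was learned on port 1
```

Here the first packet teaches the bridge where station 0xA lives, so the reply addressed to 0xA can be routed to port 1.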
In order to prevent excessive packet delay in the bridge, these two searches should be performed within a packet transmission interval. Thus, the searches should be accomplished very quickly. The packet transmission interval, and hence the routing decision period, may be very small in high-data-rate networks such as those employing the 100BASE-T and 1000BASE-CX Ethernet protocols.
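To give a rough sense of scale, the following sketch estimates how long a short frame occupies the link. The figures are assumptions not stated in the text: a 64-byte minimum frame, 8 bytes of preamble and start delimiter, a 12-byte inter-frame gap, and no carrier extension at 1 Gb/s. The function name is hypothetical.

```python
def wire_time_us(frame_bytes: int, rate_bps: float, overhead_bytes: int = 20) -> float:
    """Microseconds that one frame occupies the link.

    overhead_bytes is an assumption: 8 bytes of preamble/start delimiter
    plus a 12-byte inter-frame gap.
    """
    bits = (frame_bytes + overhead_bytes) * 8
    return bits / rate_bps * 1e6


MIN_FRAME = 64  # assumed minimum frame size in bytes

fast_ethernet_us = wire_time_us(MIN_FRAME, 100e6)  # 100 Mb/s link
gigabit_us = wire_time_us(MIN_FRAME, 1e9)          # 1 Gb/s link
```

Under these assumptions the budget for both searches is about 6.72 microseconds at 100 Mb/s and about 0.672 microseconds at 1 Gb/s.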
The searching of a station list for source and destination addresses and routing packets to the right port(s) may be referred to as "Network Address Filtering," or NAF.
There are several conventional search means used for network address filtering including hashing algorithms, binary searches, and CAMs (Content Addressable Memories).
Hashing algorithms and binary searches can be accomplished via software or with hardware-based accelerators. However, both of these methods have technical and practical limitations.
Hashing methods may be cost-effective but are slow relative to the time allotted for searching, especially for a sparsely populated list. For example, the 48-bit IEEE 802.x address formats provide an address space of 2^48, or about 281 trillion (2.81 × 10^14), possible addresses. A typical network application, however, may use only a few thousand addresses, thus creating a sparsely populated list. The hashing algorithm is a very inefficient means for searching a sparsely populated list. The frequent occurrence of a sparsely populated list in practical applications makes the hashing algorithm an inefficient method for NAF.
Station lists characteristically contain addresses numbering in the thousands. Consequently, binary searches can be done somewhat efficiently (e.g., about 12 clock cycles for a 4096, or 2^12, entry list). However, the binary search algorithm has overhead processing costs that create practical implementation difficulties, and binary searching of station lists may thus increase the risk of occasionally large network routing delays. For example, a binary search requires a sorted list (i.e., one where the addresses are sorted in ascending or descending order) before the algorithm can be used. Thus, when a new address entry is added, the list must be re-sorted before a routing decision can be made by the binary search method. Sorting the list may be a time-consuming operation that delays network routing decisions. Because network protocol standards support very large networks, network "events" (such as port failures, backbone failures, power outages and the like) may occur more frequently than in smaller networks. Each time a network "event" occurs, the network routing table (or station list) may be updated with new routing information. In large networks, list entries may be added or deleted more frequently as a function of the number of network "events," or of the percentage availability of network resources, and the problem of having to re-sort the list can arise more often. In general, network planners and engineers seek to avoid this risk.
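The trade-off described above can be sketched as follows. The function names are hypothetical, and Python's standard `bisect` module stands in for the sort-maintenance step; the lookup takes on the order of log2(n) probes, while each insertion must shift list entries to preserve the sorted order.

```python
import bisect
from typing import List, Optional, Tuple


def search(sorted_addrs: List[int], addr: int) -> Tuple[Optional[int], int]:
    """Binary search; returns (index or None, number of probes made)."""
    lo, hi = 0, len(sorted_addrs)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_addrs[mid] == addr:
            return mid, probes
        if sorted_addrs[mid] < addr:
            lo = mid + 1
        else:
            hi = mid
    return None, probes


def learn(sorted_addrs: List[int], addr: int) -> None:
    """Add a new address while keeping the list sorted.

    bisect.insort finds the slot quickly, but inserting into the list
    still moves every later entry, which is the update overhead at issue.
    """
    index, _ = search(sorted_addrs, addr)
    if index is None:
        bisect.insort(sorted_addrs, addr)


station_list = list(range(0, 8192, 2))        # 4096 sorted example addresses
index, probes = search(station_list, 4094)    # lookup of a known address
learn(station_list, 4095)                     # new station forces an ordered insert
```

Note that the probe count differs from one lookup to the next, depending on where the entry sits in the list.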
A further disadvantage of the hashing and binary search algorithms is an indeterminate search time. That is, these algorithms have a search time that varies with each network transaction, depending on a given entry's position in the list. The search time is thus a factor beyond the engineer's absolute control. Furthermore, to determine the absence of a given entry in the list, the entire list must be searched.
The highest performance searching is accomplished using a CAM. Unfortunately, to date, conventional CAMs have been a relatively expensive form of memory on a cost-per-bit basis.
A discussion of CAMs in telecommunication networks can be organized into three categories: (1) CAMs, (2) CAM applications and (3) Network Address Filtering.
A CAM differs from a RAM in that a CAM adds a comparison logic function to every memory cell. This added functionality raises the component count in each cell by the number of transistors or other components/circuit elements needed to perform the comparison function, but adds a "parallel processing" characteristic to the CAM memory array.
Conventional CAMs used in NAF applications typically use 9- or 10-transistor static cells. The conventional CAM cell comprises a Static Random Access Memory (SRAM) cell and an Exclusive NOR (XNOR) logic function. A six (6)-transistor SRAM cell with a four (4)-transistor XNOR gate is shown in FIG. 1.
TABLE 1
______________________________________
Static CAM Truth Table
Data (BL)    Comparand (C)    Match Line
______________________________________
0            0                1
0            1                0
1            0                0
1            1                1
______________________________________
It is assumed that a load impedance external to the cell is used to pull up each Match line output. Note from FIG. 1 that Q4 and Q8 are P-Channel MOSFETs in a conventional six-transistor SRAM cell. All other transistors are N-Channel. The conventional CAM cell truth table is shown in Table 1.
This conventional CAM cell occupies an area roughly twice that of an SRAM cell. In addition, a conventional CAM function requires a priority encoder which is connected to all of the Match lines, e.g., one for each word in the CAM. The priority encoder is used to prioritize multiple matches and to compute and/or output a single Match address.
In general a CAM achieves its search performance by simultaneously comparing all entries stored in the memory with an externally applied "comparand." Words in the CAM which "match" the comparand result in a Match line HIGH (true) while all words that contain even a single bit that does not match the corresponding comparand bit result in a Match line LOW (false).
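The compare-and-encode behavior can be modeled in a few lines. This is a software sketch only: the hardware evaluates every Match line simultaneously, whereas the loop here visits words one at a time. The function names are hypothetical, and the choice that the lowest matching address wins is an assumption, since the text does not specify the priority rule.

```python
from typing import List, Optional


def match_lines(words: List[int], comparand: int, width: int = 48) -> List[bool]:
    """One Match line per stored word: True only if every bit agrees."""
    all_ones = (1 << width) - 1
    # Bitwise XNOR is all ones exactly when word and comparand are equal.
    return [(~(word ^ comparand) & all_ones) == all_ones for word in words]


def priority_encode(lines: List[bool]) -> Optional[int]:
    """Reduce the Match lines to a single address (lowest wins: an assumption)."""
    for address, hit in enumerate(lines):
        if hit:
            return address
    return None


cam = [0x0000AA, 0x0000BB, 0x0000AA]       # example stored words
lines = match_lines(cam, 0x0000AA)          # words 0 and 2 match
match_address = priority_encode(lines)      # encoder resolves the double match
```

Because two words match in this example, the priority encoder is what turns the set of HIGH Match lines into one Match address.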
The line marked "Mask VSS" in FIG. 1 can be used as a "global" bit mask such that when MASK is asserted (HIGH, true or logic "1"), the corresponding bit position in every word in the CAM is eliminated from the compare function (i.e., it becomes a global "don't care" (forced match) logic value for every word in the CAM). Such global masking is useful in comparing or searching for ranges of entries.
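As a hedged illustration of how a global mask supports range searches, the sketch below removes the masked bit positions from the comparison for every stored word. The function name and the 8-bit example width are assumptions chosen for readability.

```python
from typing import List


def match_lines_global_mask(words: List[int], comparand: int,
                            global_mask: int, width: int = 8) -> List[bool]:
    """Compare only the bit positions where global_mask is 0.

    A 1 in global_mask makes that bit position a "don't care"
    for every word in the array.
    """
    care = ~global_mask & ((1 << width) - 1)
    return [((word ^ comparand) & care) == 0 for word in words]


cam = [0x10, 0x1F, 0x20]
# Masking the low four bits matches the whole range 0x10 through 0x1F.
in_range = match_lines_global_mask(cam, comparand=0x15, global_mask=0x0F)
```

With the low four bits masked, any comparand in the range selects the first two words, while 0x20 still mismatches in the unmasked upper bits.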
In network systems handling multiple protocols and/or formats, such as bridges and routers, local bit masking is also useful in facilitating routing decisions. Locally disqualifying a given bit from participating in comparisons is expensive and difficult in conventional static CAMs because the storage element, an SRAM cell, is typically only able to store a "1" or a "0." Thus, static CAM cell area would have to nearly double to store a third, "masked" state. Placing a third state in memory read applications also requires a more complex implementation of the memory sense amplifier.
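Local masking differs from global masking in that each stored word carries its own "don't care" positions. The sketch below models this by pairing every stored value with a per-word mask. This representation is an assumption made for illustration; it describes the logical behavior only, not how any particular cell stores the third state.

```python
from typing import List, Tuple


def match_lines_local_mask(entries: List[Tuple[int, int]], comparand: int,
                           width: int = 8) -> List[bool]:
    """entries holds (value, local_mask) pairs.

    A 1 in local_mask excludes that bit from the comparison for that
    word only; other words are unaffected.
    """
    all_ones = (1 << width) - 1
    return [((value ^ comparand) & ~mask & all_ones) == 0
            for value, mask in entries]


entries = [
    (0xA0, 0x0F),  # low four bits masked: matches 0xA0 through 0xAF
    (0xA5, 0x00),  # no masked bits: exact match only
]
hits = match_lines_local_mask(entries, comparand=0xA7)
```

Here the first entry matches 0xA7 because its low bits are masked, while the second entry requires an exact match and does not.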
The work of Wade and Sodini, as discussed in "A Ternary Content Addressable Search Engine," demonstrates a five (5)-transistor Dynamic CAM cell (hereinafter called the "Wade cell") that stores a "mask" logic state. The Wade cell, however, has a poor signal-to-noise ratio, making it difficult to use in practical applications. This poor signal-to-noise ratio is caused in part by the direct coupling of the Match line output to the storage elements of the cell. The direct coupling of the output to the storage element can corrupt the value stored in the storage element when the Match line is read. Additional circuit elements may be required to decouple the output line from the storage element to make practical implementations possible with the Wade cell. However, the additional circuit components would greatly reduce the benefit (primarily a low transistor count) that the Wade cell enjoys.
CAM cell density can be increased and, thus, the cost decreased, by using a dynamic memory storage element because such memory elements typically employ fewer circuit components. One such Dynamic CAM (DCAM) cell is discussed in U.S. Pat. No. 4,791,606, co-invented by Mr. Bruce Threewitt, the inventor of the present invention, and assigned to MUSIC Semiconductors, Inc. This conventional DCAM cell is shown in FIG. 2. Table 2 shows the logic functions of the DCAM cell. Note that with the exception of the complementary Comparand input (/C), the logic function defined in Table 2 (for the DCAM cell) is the same logic function defined in Table 1 (for the static CAM cell).
TABLE 2
______________________________________
DCAM Cell Logic Function
Data    C    /C    Match Line
______________________________________
0       0    1     1
0       1    0     0
1       0    1     0
1       1    0     1
______________________________________
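The equivalence of the two truth tables can be checked mechanically. In the sketch below, the function names are hypothetical, and the Boolean expression used for the DCAM cell is one possible formulation of the XNOR function using the complementary input; it is not derived from the transistor arrangement of FIG. 2.

```python
def static_cam_match(data: int, c: int) -> int:
    """Table 1: Match line is 1 when Data equals the Comparand (XNOR)."""
    return int(data == c)


def dcam_match(data: int, c: int, c_bar: int) -> int:
    """Table 2: same function expressed with C and its complement /C."""
    return int(bool((data and c) or ((not data) and c_bar)))


# Rows of Table 2 as (Data, C, /C, Match Line).
TABLE_2 = [(0, 0, 1, 1), (0, 1, 0, 0), (1, 0, 1, 0), (1, 1, 0, 1)]

tables_agree = all(
    dcam_match(d, c, c_bar) == expected == static_cam_match(d, c)
    for d, c, c_bar, expected in TABLE_2
)
```

Every row of Table 2 yields the same Match line value as the corresponding row of Table 1, so the complementary input changes the interface but not the logic function.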
This DCAM cell is based on a common one-transistor, one-capacitor DRAM storage element comprising Q1 and C1, as shown in FIG. 2. The XNOR function is accomplished by Q2, Q3, Q4 and Q5, plus an implied external load impedance on the Match line. Q4 and Q5 are P-Channel MOSFETs and are arranged one above the other to consolidate corresponding gates and the N-well isolation region into a convenient shape for processing (i.e., a vertical stripe). Further, this cell can be adjacent to a mirror-image cell such that the N-well vertical stripe can be shared by two columns of cells alternated with two columns which share a P-well vertical stripe. This feature allows relatively denser packing than the static CAM cell of FIG. 1. However, this cell is single-ended and cannot store more than two possible logic states (i.e., cannot store a local "masked" logic state). A more complicated DRAM manufacturing process is also required to make the cell because of the P-Channel MOSFETs.