The present invention relates to associative memory devices and methods, and more particularly, to content addressable memory (CAM) devices capable of partitioned operation and methods of operation thereof.
High-speed, high-volume address lookup operations are a common requirement in data communications applications, such as Internet routing. For example, as channel line rates in core networks increase from OC-48 to OC-768, core routers will generally need to process hundreds of millions of packets per second. This creates the need for very fast searches of Layer 2 (L2) and Layer 3 (L3) forwarding tables. In Access and Edge networks, the requirement for fast search operations is also significant. While line rates may be lower (e.g., OC-48 and below), increasing “network intelligence” means that each packet typically requires several database searches. These searches can include operations for packet forwarding, quality of service (QoS) classification, access control and security.
Consequently, there is a need for large longest prefix matching (LPM) address forwarding databases that can be quickly searched. For example, new routers may need to support more than 1 million LPM addresses in their databases while still searching those databases very quickly. As line rates go from OC-48 (2.5 Gbps) to OC-768 (40 Gbps), routers may have to perform more than 125M lookups per second.
Hierarchical search algorithms, e.g., “trie” algorithms, implemented in random access memory (RAM) have been used for LPM. For example, in routing applications based on Internet Protocol version 4 (IPv4), a user typically looks at a first portion of an address to index to another node in the search tree. A second portion of the address indexes the entries in this node. The trie algorithm typically repeats until a null entry is reached, returning a best-matched entry as an LPM result.
FIG. 1 illustrates such an algorithm. The portions of an address of overall width W looked up each time are called “strides.” Speed and memory storage complexity generally depend on the length k of the strides. More strides generally mean more lookups and, therefore, lower lookup speed. However, reduced stride width generally means less memory is required, as nodes can be smaller. Typical stride configurations for IPv4 lookups are 8-8-8-8 and 16-4-4-4-4. Variations, like 16-8-8, are also used. The stride factor defines the memory usage, e.g., for 8-8-8-8, 8 bits of stride require a 256-entry node allocation at each address addition.
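By way of illustration only, a fixed-stride trie lookup of the kind described above may be sketched as follows. The sketch assumes an 8-8-8-8 stride configuration and handles prefixes whose lengths are not multiples of the stride by expanding them over the node slots they cover. The names `Node`, `insert`, `lookup` and `ip` are hypothetical and are not taken from FIG. 1.

```python
STRIDES = (8, 8, 8, 8)  # assumed IPv4 stride configuration (sums to 32 bits)


class Node:
    """One trie node; a stride of k bits needs 2**k slots."""

    def __init__(self, stride):
        size = 1 << stride
        self.plen = [-1] * size        # length of the prefix owning each slot
        self.next_hop = [None] * size  # result stored for each slot
        self.child = [None] * size     # pointer to the next-level node


def ip(a, b, c, d):
    """Pack four octets into a 32-bit integer address."""
    return (a << 24) | (b << 16) | (c << 8) | d


def insert(root, prefix, length, next_hop):
    """Add prefix/length to the trie, expanding it inside its final node."""
    node, consumed = root, 0
    for level, stride in enumerate(STRIDES):
        shift = 32 - consumed - stride
        index = (prefix >> shift) & ((1 << stride) - 1)
        if length <= consumed + stride:
            # The prefix ends in this node: fill every slot it covers,
            # unless a longer prefix already owns that slot.
            free = consumed + stride - length
            base = index & ~((1 << free) - 1)
            for i in range(base, base + (1 << free)):
                if node.plen[i] <= length:
                    node.plen[i] = length
                    node.next_hop[i] = next_hop
            return
        if node.child[index] is None:
            node.child[index] = Node(STRIDES[level + 1])
        node = node.child[index]
        consumed += stride


def lookup(root, addr):
    """Walk one stride at a time until a null child; return the best match."""
    node, consumed, best = root, 0, None
    for stride in STRIDES:
        shift = 32 - consumed - stride
        index = (addr >> shift) & ((1 << stride) - 1)
        if node.plen[index] >= 0:
            best = node.next_hop[index]  # deeper nodes hold longer prefixes
        node = node.child[index]
        if node is None:
            break
        consumed += stride
    return best


root = Node(STRIDES[0])
insert(root, ip(10, 0, 0, 0), 8, "A")
insert(root, ip(10, 1, 0, 0), 16, "B")
insert(root, ip(10, 1, 2, 0), 23, "C")
print(lookup(root, ip(10, 1, 3, 5)))
```

In this sketch each lookup touches at most four nodes, one per stride, while each newly allocated node costs 256 slots, which reflects the speed and memory trade-off noted above.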
Content addressable memory (CAM) is also commonly used for address lookup operations. CAM cells are frequently configured as binary CAM cells that store only data bits (as “1” or “0” logic values) or as ternary CAM (TCAM) cells that store data bits and mask bits. As will be understood by those skilled in the art, when a mask bit within a ternary CAM cell is inactive (e.g., set to a logic 1 value), the ternary CAM cell may operate as a conventional binary CAM cell storing an “unmasked” data bit. When the mask bit is active (e.g., set to a logic 0 value), the ternary CAM cell is treated as storing a “don't care” (X) value, which means that all compare operations performed on the actively masked ternary CAM cell will result in a cell match condition. Thus, if a logic 0 data bit is applied to a ternary CAM cell storing an active mask bit and a logic 1 data bit, the compare operation will indicate a cell match condition. A cell match condition will also be indicated if a logic 1 data bit is applied to a ternary CAM cell storing an active mask bit and a logic 0 data bit. Accordingly, if a data word of length N, where N is an integer, is applied to a TCAM array block having a plurality of entries therein of logical width N, then a compare operation will yield one or more match conditions whenever all the unmasked data bits of an entry in the TCAM array block are identical to the corresponding data bits of the applied search word. This means that if the applied search word equals {1011}, the following entries will result in a match condition in a TCAM: {1011}, {X011}, {1X11}, {10X1}, {101X}, {XX11}, {1XX1}, . . . , {1XXX}, {XXXX}.
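The ternary compare behavior described above may be modeled in software, purely for illustration, as follows. The sketch assumes the example convention given above, in which a mask bit of 1 is inactive (the data bit is compared) and a mask bit of 0 is active (the cell is a “don't care”). The functions `parse_entry`, `entry_matches` and `tcam_search` are hypothetical names and do not describe any particular TCAM circuit.

```python
def parse_entry(pattern):
    """Convert a string such as 'X011' into a (data, mask) pair.

    Mask bit 1 = inactive mask (cell is compared); 0 = active mask (X).
    """
    data = mask = 0
    for ch in pattern:
        data <<= 1
        mask <<= 1
        if ch == "1":
            data |= 1
            mask |= 1
        elif ch == "0":
            mask |= 1
        elif ch != "X":
            raise ValueError("pattern may contain only 0, 1 and X")
    return data, mask


def entry_matches(entry, key):
    """An entry matches when every unmasked data bit equals the key bit."""
    data, mask = entry
    return ((data ^ key) & mask) == 0


def tcam_search(patterns, key):
    """Return the indices of all entries that match the applied search word.

    A hardware TCAM compares all entries in parallel; this loop only
    models the result of that comparison.
    """
    return [i for i, p in enumerate(patterns)
            if entry_matches(parse_entry(p), key)]


patterns = ["1011", "X011", "1X11", "10X1", "101X", "XX11", "1XX1",
            "1XXX", "XXXX", "0011", "1010"]
print(tcam_search(patterns, 0b1011))
```

With the search word {1011}, the first nine entries in this list match, while {0011} and {1010} do not, because each differs from the search word in an unmasked bit position.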
CAM is generally well suited for LPM. When a CAM is used for address lookup, entries can be searched in parallel, and a matching entry can be found in one instruction. Address table maintenance in a CAM can also be efficient, as it may take only one write instruction to add an entry to the table, and one instruction to delete an entry from the table.
As fabrication technology improves, bigger and bigger CAMs can be manufactured. However, increasing CAM size can lead to power problems, as a typical CAM pre-charges all of the entries therein for a search. If a device, such as a router, uses multiple CAMs, total power dissipation can be undesirably high.
Techniques for reducing power consumption in large CAM arrays have been proposed. For example, “Reducing TCAM Power Consumption and Increasing Throughput,” by Panigrahy et al., Hot Interconnects 2002, describes distributing address entries across a plurality of TCAM chips based on prefix ranges of the entries, and using a pruned search technique based on the prefix ranges such that respective chips are searched for addresses having prefixes in respective ranges. In this manner, power consumption can be reduced. Such a prefix mapping can also be used to provide a higher number of lookups for a given prefix range than supported by a single TCAM chip.
U.S. Pat. No. 6,324,087 to Pereira describes a CAM device having a plurality of CAM blocks and partitioned into a number of individually searchable partitions, wherein each partition may include one or more CAM blocks. During compare operations between a comparand word and data stored in the CAM device, a search code is provided to block select circuits, which selectively enable or disable their corresponding CAM blocks. The search code may be provided separately from a comparand word supplied to the CAM blocks, or as part of the comparand word.
U.S. Pat. No. 6,542,391 to Pereira et al. describes a CAM device having a plurality of CAM blocks and a block selection circuit. The block selection circuit includes an input to receive a class code and circuitry to output a plurality of select signals to the plurality of CAM blocks. Each of the select signals selectively disables a respective one of the CAM blocks from participating in a compare operation according to whether the class code matches a class assignment of the CAM block.
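The general idea of class-based block selection may be pictured, in a simplified software model, as follows. This is a sketch under stated assumptions and is not the circuitry of the cited patent: the `CamBlock` class, its `searches` counter and the `search` function are hypothetical, and each block is modeled as an exact-match table rather than a CAM array.

```python
from dataclasses import dataclass, field


@dataclass
class CamBlock:
    class_code: int                               # class assignment of the block
    entries: dict = field(default_factory=dict)   # stored word -> result
    searches: int = 0                             # compares this block took part in

    def compare(self, comparand):
        """Model one compare operation within this block."""
        self.searches += 1
        return self.entries.get(comparand)


def search(blocks, class_code, comparand):
    """Compare only in blocks whose class assignment matches the class code."""
    results = []
    for block in blocks:
        if block.class_code != class_code:
            continue  # models a select signal disabling this block
        hit = block.compare(comparand)
        if hit is not None:
            results.append(hit)
    return results


blocks = [
    CamBlock(0, {0xAB: "fwd-1"}),
    CamBlock(1, {0xAB: "acl-1"}),
    CamBlock(0, {0xCD: "fwd-2"}),
]
print(search(blocks, 0, 0xAB))
print([b.searches for b in blocks])
```

In this model, a block whose class assignment does not match the class code never performs a compare, which corresponds to the block being excluded from the compare operation.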
U.S. Pat. No. 6,538,911 to Allan et al. describes a CAM with a block select for power management. The CAM device includes a search port that is in communication with a plurality of memory blocks and that is capable of facilitating search operations using the memory blocks. A block select bus is capable of selecting at least one specific memory block, such that search operations are performed using only the selected memory blocks.