Network search engines (NSEs), which can include content addressable memory (CAM) devices such as ternary CAM (TCAM) devices, can be used in many applications related to data networks. As but one example, NSEs can be used to search databases such as Access Control Lists (ACLs). An ACL can present a set of rules that can limit access to a network (e.g., forwarding of packets) to only those packets having fields falling within a particular range.
The relatively rapid speed at which CAM devices can compare multiple entries to an applied search key value has resulted in CAMs enjoying widespread use in NSE devices.
A typical CAM device can include a large number of entries, each of a designated width. As but one example, a 20-megabit CAM device can have 256K entries, each having a width of 80 bits. CAM devices may include both binary CAM devices and ternary CAM (TCAM) devices. Binary CAM devices typically generate a match indication when all bits of a search key match all the bits of an entry. TCAM devices typically include entries having data bits that can be masked from a compare operation. Thus, a search key bit can be said to match a corresponding entry bit when the two are the same, or when the entry bit is masked.
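As an illustration only, the difference between a binary compare and a ternary compare can be sketched as follows. The function names and the mask polarity (a set mask bit excludes that bit position from the compare) are assumptions made for this sketch and are not taken from any particular device.

```python
def binary_match(key: int, data: int, width: int) -> bool:
    """Binary CAM compare: every bit of the key must equal the entry bit."""
    full = (1 << width) - 1
    return (key & full) == (data & full)


def ternary_match(key: int, data: int, mask: int, width: int) -> bool:
    """TCAM compare: bit positions set in `mask` are excluded (assumed polarity)."""
    full = (1 << width) - 1
    care = ~mask & full  # bit positions that still take part in the compare
    return (key & care) == (data & care)
```

For example, with an 8-bit entry holding data 0b10100000 and mask 0b00001111, the key 0b10100110 matches under the ternary compare because the low four bits are masked, but does not match under the binary compare.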
CAM devices can support a variety of NSE device functions, including “read” operations, “write” operations, and “search” operations. In a read operation, a CAM entry value (data and/or mask value) can be output according to an applied read address. In a write operation, a data value (and/or mask value) can be input in conjunction with a write address to store the data value at a CAM entry location. In a search operation, a search key can be applied to all CAM entries, and a highest priority CAM entry matching the search key can register a match (HIT); if no entry matches, a MISS is registered. The result of a HIT is typically an index value, which can be the address of the matching CAM entry or some value generated therefrom. Priority of HIT results is typically determined according to CAM entry address. For example, in the event multiple CAM entries match a given search key, the CAM entry having the lowest address is given the highest priority, typically by operation of a priority encoder.
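The read, write, and search operations can be modeled in software roughly as shown below. This is a behavioral sketch under stated assumptions, not a description of any actual device: the class name `ToyCam`, the use of `None` for an unwritten entry, and the sequential scan standing in for the parallel compare and priority encoder are all hypothetical.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, int]  # (data, mask); a set mask bit is assumed to mean "masked"


class ToyCam:
    """Behavioral model only; a real CAM compares all entries in parallel."""

    def __init__(self, depth: int, width: int) -> None:
        self.full = (1 << width) - 1
        self.entries: List[Optional[Entry]] = [None] * depth

    def write(self, address: int, data: int, mask: int = 0) -> None:
        # Store a data value (and mask value) at the given write address.
        self.entries[address] = (data & self.full, mask & self.full)

    def read(self, address: int) -> Optional[Entry]:
        # Output the entry value stored at the given read address.
        return self.entries[address]

    def search(self, key: int) -> Optional[int]:
        # Scanning upward from address 0 stands in for the priority encoder:
        # the lowest matching address is returned as the index.
        for address, entry in enumerate(self.entries):
            if entry is None:
                continue
            data, mask = entry
            care = ~mask & self.full
            if (key & care) == (data & care):
                return address  # HIT
        return None  # MISS
```

If address 2 holds 0xA6 with no mask and address 5 holds 0xA0 with the low four bits masked, a search for 0xA6 matches both entries and returns index 2, the lowest matching address.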
Another desirable operation of an NSE device can be a “learn” operation. In a learn operation, an input data value can be written to a next “free” CAM entry location. Such an operation can also include outputting the address of the CAM entry location. A free CAM entry is a CAM entry that does not currently store valid data, and hence is available for storing a valid data value. A next free CAM entry is the highest priority CAM entry that is free. Conversely, a “not-free” CAM entry is a CAM entry that is not available for storing a new data value.
The address corresponding to a next free CAM entry is referred to as a next free address (NFA).
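A learn operation and its next free address can be sketched as follows. The sketch assumes that the lowest address has the highest priority, consistent with the search priority described above, and represents per-entry status as a list of valid flags; the function names are hypothetical.

```python
from typing import List, Optional


def next_free_address(valid: List[bool]) -> Optional[int]:
    """Return the lowest address whose entry is free, or None if none is free."""
    for address, is_valid in enumerate(valid):
        if not is_valid:
            return address
    return None


def learn(data: List[Optional[int]], valid: List[bool], value: int) -> Optional[int]:
    """Write `value` to the next free entry and output that entry's address."""
    nfa = next_free_address(valid)
    if nfa is None:
        return None  # no free entry is available
    data[nfa] = value
    valid[nfa] = True  # the entry is now not-free
    return nfa
```

The caller supplies only the data value; the address is chosen by the device model and returned, so no external resource has to track which entries are free.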
The inclusion of a learn operation capability in an NSE device can be highly desirable, as it can eliminate the need for other system resources to keep track of which entries are free and which entries are not, and/or which of the free entries has a highest priority.
In one conventional arrangement for performing a learn operation, one or more bits in every CAM entry can indicate if the CAM entry is free or not-free (called the “entry-free” bit). When a learn operation is performed, a CAM entry is written to, and the corresponding entry-free bit is marked not-free. A search-next-free operation is then performed on the “entry-free” bits of each CAM entry to locate a next free entry.
To better understand various aspects of the present invention, a conventional approach to executing a learn operation with a CAM device will now be described with reference to FIG. 12.
A conventional CAM device 1200 can include a number of “superblocks”, one of which is shown as 1202. Each superblock 1202 can include a number of sub-blocks 1204, each of which can include a number of CAM rows 1206-0 to 1206-511. Learn operations are facilitated by sub-blocks 1204 that include CAM rows (1206-0 to 1206-511) that provide a match indication (M0 to M511) as well as a status indication (C0 to C511), or entry-free bit. Further, multiplexers (MUXs) 1208-0 to 1208-511 are provided to each row.
In a search operation, MUXs (1208-0 to 1208-511) can provide match indications to a priority encoder 1210 to thereby prioritize and encode a highest match result.
However, in a learn operation, MUXs (1208-0 to 1208-511) can provide status indications (C0 to C511) to priority encoder 1210. Status indications can be the inverse of a valid bit for an entry. Thus, the priority encoder will prioritize and encode a highest priority “invalid” (or free) entry. A data value can then be written to the address of such an entry.
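The sharing of one priority encoder between search and learn operations can be illustrated with a short behavioral sketch. The function names are hypothetical, and the sketch assumes the lowest row index has the highest priority.

```python
from typing import List, Optional


def priority_encode(lines: List[bool]) -> Optional[int]:
    """Encode the index of the first asserted line (lowest index wins)."""
    for index, asserted in enumerate(lines):
        if asserted:
            return index
    return None


def sub_block_result(match: List[bool], valid: List[bool], learn: bool) -> Optional[int]:
    """Per-row MUX: feed status bits for a learn, match bits for a search."""
    status = [not v for v in valid]  # status indication = inverse of valid bit
    selected = status if learn else match
    return priority_encode(selected)
```

With match bits [0, 0, 1, 1] and valid bits [1, 0, 1, 0], a search yields row 2 (the highest priority match), while a learn yields row 1 (the highest priority free entry).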
In the very particular arrangement of FIG. 12, in a learn operation, learn results from each sub-block 1204 can be stored in a corresponding sub-block next free address (NFA) register 1212-0 to 1212-m. A value in one of the sub-block NFA registers (1212-0 to 1212-m) can be selected by a magnitude comparator 1214-n. Such a selected value can be stored in a super-block NFA register (1216-0 to 1216-p). A value in one of the super-block NFA registers (1216-0 to 1216-p) can be selected by a global priority encoder 1218. Such a value can then be stored in a global NFA register 1220.
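The hierarchical selection of a global NFA can be sketched as below. Several details are assumptions made only for illustration: each sub-block NFA register is modeled as holding an absolute address (or `None` when the sub-block is full), the magnitude comparator is modeled as selecting the smallest address, and the global priority encoder is modeled as selecting the first superblock that reports a free entry.

```python
from typing import List, Optional


def magnitude_select(sub_block_nfas: List[Optional[int]]) -> Optional[int]:
    """Model of a magnitude comparator: pick the smallest available address."""
    present = [nfa for nfa in sub_block_nfas if nfa is not None]
    return min(present) if present else None


def global_priority_select(super_block_nfas: List[Optional[int]]) -> Optional[int]:
    """Model of the global priority encoder: first superblock with a free entry."""
    for nfa in super_block_nfas:
        if nfa is not None:
            return nfa
    return None


def global_nfa(sub_block_nfas_per_superblock: List[List[Optional[int]]]) -> Optional[int]:
    """Sub-block NFA registers -> super-block NFA registers -> global NFA register."""
    super_block_nfas = [magnitude_select(regs) for regs in sub_block_nfas_per_superblock]
    return global_priority_select(super_block_nfas)
```

For instance, if the first superblock is full, the second reports a free entry at address 1540, and the third reports free entries at 2048 and 2600, the global NFA in this model is 1540.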
A conventional CAM device 1200 can further include a write address MUX 1222. A write address MUX can have one input that receives an address from global NFA register 1220 and another input that receives an external write address. In a learn operation, the address value in global NFA register 1220 can be output by write address MUX 1222. In a “normal” write operation (not a learn operation), an externally applied write address can be output by write address MUX 1222.
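The role of the write address MUX can be expressed in a few lines. The function and parameter names are hypothetical; the sketch only shows which address source is selected for each type of write.

```python
def write_address_mux(is_learn: bool, global_nfa: int, external_address: int) -> int:
    """Select the write address: global NFA for a learn, external address otherwise."""
    return global_nfa if is_learn else external_address


def write_entry(entries: list, value: int, is_learn: bool,
                global_nfa: int, external_address: int) -> int:
    """Write `value` at the MUX-selected address and return that address."""
    address = write_address_mux(is_learn, global_nfa, external_address)
    entries[address] = value
    return address
```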
Various aspects of such an arrangement are further detailed in U.S. Pat. No. 6,647,457 issued to Sywyk et al. on Nov. 11, 2003.
Other conventional approaches to providing CAM entry status information have included incorporating “shadow” registers to maintain a record of which entries are valid.
Conventional solutions like those described above can have some disadvantages. In many next-generation applications, a higher throughput of operations (such as learns) is expected, at the cost of higher latency. For example, a conventional approach can provide a throughput of 8 million learn operations per second with a latency of 3 to 4 cycles. However, future applications may require a throughput of 30 to 50 million learn operations per second with a latency of 20 to 40 cycles. Thus, a throughput of 30 to 50 million learns per second is desirable, even at the cost of higher search latencies.
In addition, some conventional solutions for providing a learn operation require that a “search-miss” event occur before the learn operation is performed. That is, the learn operation can be performed only after a search key has been applied and none of the CAM entries match. Further, in such an arrangement, the write data for the learn operation is typically limited to the search key data. Such approaches are suitable for earlier generation CAM applications, such as those in which media access control (MAC) learning was a primary application. However, in future applications, such as reflexive ACLs (which can dynamically update entry values in response to particular incoming/outgoing data packet values), operations do not follow the search-miss-then-learn pattern. In particular, a learn operation can occur after a search hit result.
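The two usage patterns can be contrasted with a toy exact-match table. Everything in this sketch is hypothetical (the `LearnTable` class, the function names, and the use of strings as entry values); it is intended only to show that the first pattern learns the search key itself after a miss, while the second learns different data after a hit.

```python
from typing import List, Optional


class LearnTable:
    """Toy exact-match table; None marks a free entry."""

    def __init__(self, depth: int) -> None:
        self.slots: List[Optional[str]] = [None] * depth

    def search(self, key: str) -> Optional[int]:
        for address, slot in enumerate(self.slots):
            if slot == key:
                return address  # HIT
        return None  # MISS

    def learn(self, value: str) -> Optional[int]:
        for address, slot in enumerate(self.slots):
            if slot is None:
                self.slots[address] = value
                return address
        return None  # table is full


def mac_learning(table: LearnTable, src_mac: str) -> Optional[int]:
    """Search-miss-then-learn: the learned data is the search key itself."""
    hit = table.search(src_mac)
    return hit if hit is not None else table.learn(src_mac)


def learn_after_hit(table: LearnTable, key: str, new_entry: str) -> Optional[int]:
    """Learn following a HIT, writing data that differs from the search key."""
    if table.search(key) is None:
        return None  # no HIT, so nothing is learned in this pattern
    return table.learn(new_entry)
```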
In light of the above, it would be desirable to arrive at some way of providing learn operations at a very high throughput rate, without necessarily providing very low latency values.
It would also be desirable to provide search engine devices and methods that do not require a search-miss event before a learn operation can be performed.
It would also be desirable to provide a search engine device that can provide any of the above features, yet not unduly increase overall area needed for such a device.
It would also be desirable to provide a search engine device that can provide any of the above features, yet remain relatively easy to implement and verify.