1. Technical Field
The present invention relates generally to static information storage and retrieval systems, and more particularly to combining associative memories, which are also referred to as content or tag memories, with addressable memories.
2. Background Art
Many modern systems require searching for information at very high speeds; hence hardware-based approaches are often employed. An increasingly common example of this, and the one primarily used herein, is searching in communications networks (e.g., in switches and routers). An increasingly popular aid in this type of search is the content addressable memory (CAM), also known as associative or tag memory. The use of CAM can either supplant or complement more traditional algorithm-based search approaches.
Briefly, the task we seek to perform here can be characterized as using a search key to retrieve a search result from a database. In particular, this task can be bifurcated, with CAM used for one part and regular, address-accessed memory used for the other. There is considerable variation in CAM-related terminology, so we try to consistently use the following terms herein. A table of lookup values stored in CAM is a “lookup table,” and its respective elements are “entries.” A table of associated content (AC) stored in regular memory is an “AC table,” and its respective elements are “records.” Collectively, the entries in the lookup table and the records in the AC table are the database. A “search key” can be matched against the entries in the lookup table to produce a “match address,” and this match address can then be used to retrieve a single record from the AC table. Specifically, that record becomes a “search result.”
Due to its parallel lookup nature, a CAM can return a result in O(1) time, thereby obviating the need for the recursive searches that would be required if using regular addressable memory. Each entry in the lookup table inherently has an ordinal address, ranging from binary 0 to binary N−1, where N is the total number of entries in the CAM. If more than one matching entry is found, the CAM employs a prioritization scheme whereby one is chosen and its address is used as the match address.
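The matching and prioritization behavior just described can be illustrated with a minimal software sketch. It is only a model under stated assumptions: the function name cam_match and the sample values are hypothetical, the loop stands in for what the hardware does in parallel, and the rule that the lowest address wins is an assumed prioritization scheme, since none is specified here.

```python
from typing import Optional, Sequence


def cam_match(entries: Sequence[int], search_key: int) -> Optional[int]:
    """Return the match address for search_key, or None if no entry matches.

    A hardware CAM compares every entry at once; this loop only models
    the outcome of that parallel comparison.
    """
    # Ordinal addresses run from 0 to N-1, where N = len(entries).
    matches = [addr for addr, entry in enumerate(entries) if entry == search_key]
    # ASSUMPTION: when several entries match, the lowest address has priority.
    return min(matches) if matches else None


# Hypothetical lookup table in which the value 0x0A is stored twice.
lookup_table = [0x0A, 0x1F, 0x0A, 0x33]
print(cam_match(lookup_table, 0x0A))  # two matches; the assumed rule picks one
print(cam_match(lookup_table, 0x77))  # no entry matches
```

A real device may use a different prioritization scheme; only the fact that exactly one match address is produced is taken from the description above.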
For example, in a typical networking application a search key may be formed by aggregating information from the packet header and payload as a packet arrives in the router. The search key is then used for lookup in one or more lookup tables stored in the CAM. If any matches are found, the respective match addresses for the respective entries are used as the basis for content addresses to access records in the respective AC tables. Multiple databases can be accessed concurrently in this manner, with the records containing the specific actions that should then be applied to the packet (e.g., metering and shaping parameters, quality of service provisions, packet counting and billing actions, DSCP remarking, CPU actions, etc.).
By placing the associated content in regular memory instead of in CAM, multiple benefits are achieved. For example, CAM is much more expensive than regular memory, so a considerable cost savings can be had. This also permits flexibly configuring the search engine, since the width of the associated content is now tied to the width of the regular memory rather than to the width of the CAM. Separating address lookup and associated content retrieval is now ubiquitous in the communications and networking industries, forming the foundation of many lookup engines in present use.
FIG. 1 (background art) is a schematic diagram depicting a typical CAM-based search engine 10. A processor 12, a CAM 14, an address unit 16, and an AC memory device (here a random access memory, RAM 18) are the major components of the search engine 10 in this example. The processor 12 hosts the underlying application and controls the various memory related operations. Today, the processor 12 will often be an application specific integrated circuit (ASIC), as shown. The RAM 18 may be a static or dynamic type (SRAM or DRAM), and some applications even use read only memory (ROM).
The processor 12 is connected to the CAM 14 by a search bus 20. In turn, the CAM 14 is connected to the address unit 16 by a match bus 22, and the address unit 16 is connected to the RAM 18 by an address bus 24. Finally, the RAM 18 is connected back to the processor 12 by a result bus 26. In many cases the address unit 16 is simply omitted and the match bus 22 and the address bus 24 then are effectively the same. A match value produced by the CAM 14 then is a match address. In FIG. 1, however, the address unit 16 is shown to emphasize that the match value obtained may be altered to form an actual content address that is used.
In use, a database 28 is stored by storing a table of lookup values in the CAM 14 and a table of associated content in the RAM 18. To then search the database 28, the processor 12 provides a search key to the CAM 14 via the search bus 20. The CAM 14 performs a parallel search of all of the entries in its lookup table. When a match is found, the CAM 14 provides a match value to the address unit 16 via the match bus 22. From this, the address unit 16 derives the content address that is used to access a single record in the AC table. Finally, the record accessed is returned to the processor 12, via the result bus 26, as the search result.
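The search sequence of FIG. 1 can be sketched in software as follows. This is a simplified model, not a description of any actual device: the function name search, the base_address parameter, and the sample records are hypothetical, and adding a base offset is just one assumed way in which the address unit 16 might alter a match value to form a content address.

```python
from typing import Dict, List, Optional


def search(lookup_table: List[int],
           ac_table: Dict[int, str],
           search_key: int,
           base_address: int = 0) -> Optional[str]:
    """Model one search: search key -> match value -> content address -> record."""
    # CAM 14: find a matching entry (first match wins in this sketch).
    match_value = next(
        (addr for addr, entry in enumerate(lookup_table) if entry == search_key),
        None,
    )
    if match_value is None:
        return None  # no match, so there is no search result

    # Address unit 16: ASSUMED transformation, a simple base offset.
    content_address = base_address + match_value

    # RAM 18: a single record is read and returned as the search result.
    return ac_table[content_address]


# Hypothetical database: three entries in CAM, three records in RAM at 0x100.
lookup_table = [0xC0A80001, 0xC0A80002, 0xC0A80003]
ac_table = {0x100: "forward", 0x101: "meter", 0x102: "drop"}
print(search(lookup_table, ac_table, 0xC0A80002, base_address=0x100))
```

With base_address left at 0, the sketch corresponds to the case in which the address unit is omitted and the match value is used directly as the content address.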
Existing CAM-based search engines, using generally the scheme just described, are fairly straightforward devices, each storing one database, with an output bus or port providing access to the address space of the database's associated content. This approach needs to adapt, however, because applications increasingly require more power and faster throughput.
A complication with existing CAM approaches so far has been cascading multiple devices to form a larger, but still single, database. This requires that the associated content share the same address space across all of the devices, and there must be a way to differentiate between entries within the different devices in the cascade structure (e.g., the system must be able to distinguish the fourth entry of one device from the fourth entry of every other device in the cascade structure). To handle this, existing CAM-based search engines provide extra input pins that allow bits to be appended to the front end of an address. So, for instance, if eight devices are cascaded together, the host device needs to drive three extra pins to identify each of the eight devices in the cascade.
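The addressing arithmetic behind this cascade scheme can be sketched as below. The function names and the figure of 1,024 entries per device are hypothetical choices for illustration; the only details taken from the text are that device-identifying bits are placed at the front of the address and that eight devices require three such bits.

```python
def id_bits_needed(num_devices: int) -> int:
    """Number of extra bits (pins) needed to tell num_devices devices apart."""
    return (num_devices - 1).bit_length()


def cascade_address(device_id: int, match_address: int,
                    entries_per_device: int, num_devices: int) -> int:
    """Place the device ID bits in front of a device-local match address."""
    local_bits = (entries_per_device - 1).bit_length()
    if not 0 <= device_id < num_devices:
        raise ValueError("device_id out of range")
    if not 0 <= match_address < entries_per_device:
        raise ValueError("match_address out of range")
    # The device ID occupies the high-order bits of the shared address space.
    return (device_id << local_bits) | match_address


# Eight cascaded devices, each ASSUMED to hold 1,024 entries.
print(id_bits_needed(8))
# The fourth entry (local address 3) of device 0 and of device 5
# map to different addresses in the shared address space.
print(cascade_address(0, 3, 1024, 8))
print(cascade_address(5, 3, 1024, 8))
```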
An obvious problem with this is that it adds cost in the form of extra pins, circuit footprint, etc. It also adds overhead that the host device must handle every time a match is generated by the CAM-based search engine cascade structure. Consequently, as structures are scaled up to support larger database sizes and more cascaded devices, the complexity and cost of handling the output associated content addressing scheme increase as well.
In addition, the next generation of CAM-based search devices needs to be much more powerful and flexible than those available today. For instance, it is desirable to have a CAM-based search processor that is able to execute not just a single lookup per cycle, but multiple simultaneous lookups per cycle. Such a search processor needs the ability to store many database tables (more than the number of simultaneous lookups). It also needs to be able to dynamically choose which database tables should be searched in a given cycle, and which output lookup port each result should go out on (since it offers simultaneous lookups per cycle). Clearly the output addressing scheme for this will have to adapt accordingly, and be commensurate with the power and flexibility of these next generation devices.