1. Field of the Invention
The present invention relates generally to memory circuits, and more particularly to priority encoders for content addressable memory (CAM) circuits.
2. Description of the Related Art
Modern computer systems and computer networks utilize memory devices for storing data and providing fast access to the data stored therein. A content addressable memory (CAM) is a special type of memory device often used for performing fast address searches. For example, Internet routers often include a CAM for searching the address of specified data. Thus, the use of CAMs allows routers to perform address searches to facilitate more efficient communication between computer systems over computer networks. Besides routers, CAMs are also utilized in other areas such as databases, network adapters, image processing, voice recognition applications, etc.
Conventional CAMs typically include a two-dimensional row and column content addressable memory core array of cells. In such an array, each row typically contains an address, pointer, or bit pattern entry. In this configuration, a CAM may perform "read" and "write" operations at specific addresses as is done in conventional random access memories (RAMs). However, unlike RAMs, data "search" operations that simultaneously compare a bit pattern of data against an entire list (i.e., column) of pre-stored entries (i.e., rows) can only be performed by CAMs.
FIG. 1A shows a simplified block diagram of a conventional CAM 10. The CAM 10 includes a data bus 12 for communicating data, an instruction bus 14 for transmitting instructions associated with an operation to be performed, and an output bus 16 for outputting a result of the operation. For example, in a search operation, the CAM 10 may output a result in the form of an address, pointer, or bit pattern corresponding to an entry that matches the input data.
Although conventional CAMs are becoming more powerful in their ability to perform searches more rapidly, each search can generate many search results that then need to be processed through a priority encoder (PE) to ascertain a match with the highest priority. Although there is a wide array of standard circuitry for completing priority encoding, as CAM memory arrays continue to grow in size and are required to operate at faster speeds, a PE must process more matches and also handle the generation of an address for a highest priority match in less time. In the prior art, attempts to address the need for speed and larger CAM arrays have been to increase the number of gates and the complexity of the design. This solution has the downside of requiring more silicon area to lay out the needed logic, which also increases cost.
Another downside of the prior art is that power consumption necessarily increases as the size of the PE design increases. The increased power consumption is generally due to the fact that PE designs require all of the logic blocks in different stages to turn ON, even when only one block in a given stage is actually contributing to the PE processing.
In view of the foregoing, what is needed is low power priority encoder circuitry that can provide increased performance for larger CAM arrays and can provide such increased performance in terms of speed with a design that requires less silicon area.
The present invention fills this need by providing CAM circuitry that includes a priority encoder that is scalable to meet a number of match line input configurations and is designed to intelligently operate in an efficient low power consuming manner. The priority encoder utilizes a multi-stage hierarchical architecture that ensures a high speed and low activity (low power) design. The priority encoder further utilizes a dynamic circuit layout so that chip area is conserved while maintaining the requirements of a high speed CAM. It should be appreciated that the present invention can be implemented in numerous ways, including as a process, an apparatus, a system, a device, or a method. Several embodiments of the present invention are described below.
In one embodiment, a priority resolver for use in a CAM circuit priority encoder is disclosed. The function of the priority resolver is to determine which of the N (where N is any integer greater than 2) matchline inputs are active and select the matchline with the highest priority (0 is highest priority and N is lowest priority). The output of the priority resolver is an N bit vector (called the resolved matchlines) with all outputs low (inactive) except for the output corresponding to the matchline with the highest priority. The priority resolver is also configured to generate global hit information, which is a logical OR function of all N matchline inputs. Additionally, the priority resolver is configured to generate a global model delay signal which mimics the worst case delay through the priority resolver and is useful for controlling the high speed timing of the priority encoder. In this embodiment, the priority resolver includes one or more priority resolver sub-units which are connected in one or more stages. Each priority resolver sub-unit performs a function similar to that of the priority resolver, but on a smaller number of inputs.
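As an illustrative aid only, the priority resolve function described above can be modeled behaviorally in software. The following Python sketch is not the disclosed circuit; the function name resolve_priority and the representation of matchlines as a list of booleans are assumptions introduced here for illustration.

```python
from typing import List, Tuple


def resolve_priority(matchlines: List[bool]) -> Tuple[List[bool], bool]:
    """Behavioral model: keep only the lowest-index active matchline."""
    resolved = [False] * len(matchlines)
    for index, active in enumerate(matchlines):
        if active:
            resolved[index] = True  # index 0 has the highest priority
            break
    global_hit = any(matchlines)  # logical OR of all N matchline inputs
    return resolved, global_hit


if __name__ == "__main__":
    lines = [False, False, True, False, True, True, False, False]
    resolved, hit = resolve_priority(lines)
    print(resolved, hit)
```

In this model, the returned list corresponds to the resolved matchlines and the boolean corresponds to the global hit information. Timing behavior, such as the global model delay signal, is not modeled.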
When configured appropriately, the sub-units collectively perform the priority resolve function on all N matchline inputs, and generate all the appropriate outputs of the priority resolver. Each priority resolver sub-unit can be configured to process M or more data inputs, where M is an integer greater than 1 and is typically much less than N. Each priority resolver sub-unit includes a dynamic OR circuit, local hit generation circuitry, a dynamic resolver circuit, a local model delay circuit, and an output differentiator and gating circuit. The dynamic OR circuit is configured to generate local hit information (pehit data). The local hit generation circuitry gates the input data with an enable signal and the pehit data. The local hit generation circuitry provides a way of saving power by reducing activity in the sub-unit. Also provided as part of a priority resolver sub-unit is a dynamic resolver circuit that is coupled to the local hit generation circuitry. The dynamic resolver circuit is configured to receive the outputs of the local hit generation circuitry and generate a resolved output vector.
Also included in the priority resolver sub-unit is a local model delay circuit which mimics the worst case delay through the sub-unit. The local model delay serves as a way of generating the global model delay signal of the priority resolver. An output differentiator and gating circuit is further provided as part of the priority resolver sub-unit and is configured to receive the output of the dynamic resolver circuit. The output differentiator and gating circuit serves as a way of minimizing two common problems associated with dynamic circuits, namely spurious output transitions (due to input skew) and output skew. In this embodiment, the priority resolver sub-unit is implemented in one or more stages of the priority resolver, and each stage is configured to include one or more priority resolver sub-units. To reduce power, only one (or at most only a few) priority resolver sub-units in each stage are configured to be activated by the enable signal.
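For illustration, the logical behavior of a single priority resolver sub-unit can be sketched in Python as follows. This is a behavioral approximation under stated assumptions, not the disclosed dynamic circuitry: the name sub_unit is hypothetical, the pehit output is assumed not to depend on the enable signal, and the local model delay and the output differentiator and gating circuit are not modeled because they address timing rather than logic.

```python
from typing import List, Tuple


def sub_unit(data_in: List[bool], enable: bool) -> Tuple[bool, List[bool]]:
    """Behavioral sketch of one priority resolver sub-unit with M inputs."""
    # Dynamic OR circuit: local hit (pehit) over the M data inputs.
    # Assumption: pehit is produced regardless of the enable signal.
    pehit = any(data_in)
    # Local hit generation: gate the inputs with enable and pehit so that a
    # sub-unit that is disabled, or has no hit, shows no resolver activity.
    gated = [bit and enable and pehit for bit in data_in]
    # Dynamic resolver: one-hot vector for the lowest-index active input.
    resolved = [False] * len(data_in)
    for index, bit in enumerate(gated):
        if bit:
            resolved[index] = True
            break
    return pehit, resolved


if __name__ == "__main__":
    print(sub_unit([False, True, True, False], enable=True))
    print(sub_unit([False, True, True, False], enable=False))
```

In this sketch, a disabled sub-unit still reports its local hit but drives all of its resolved outputs low, which corresponds to the reduced activity described above.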
In another embodiment, a priority resolver for use in a CAM circuit priority encoder is disclosed. The priority resolver includes one or more priority resolver sub-units. Each priority resolver sub-unit includes local hit (pehit) generation circuitry. The local hit (pehit) generation circuitry is configured to generate pehit data. Also provided as part of a priority resolver sub-unit is a resolve processing circuit that is coupled to the local hit (pehit) generation circuitry. The resolve processing circuit is configured to receive the pehit data and an enable signal. An output differentiator and gating circuit is further provided as part of the priority resolver sub-unit and is configured to receive an output of the resolve processing circuit. In this embodiment, the priority resolver sub-unit is implemented in one or more stages of the priority resolver, and each stage is configured to include one or more priority resolver sub-units. In this embodiment, however, only one priority resolver sub-unit in each stage is configured to be activated by the enable signal.
In yet another embodiment, a priority encoder is disclosed. The priority encoder includes: (a) a priority resolver that is configured to receive match line data, a priority encoder clock and generate a plurality of resolved match lines, a global model delay signal, and a pehit signal; (b) a priority encoder control block that is configured to receive a clock input, the global model delay signal from the priority resolver and generate a priority resolver master clock, a multiple match flop clock, a multiple match clock, an address encoder slave clock, and an address encoder sense clock; (c) a multiple match block that is configured to receive the match line data, the multiple match flop clock, a multiple match clock, and the plurality of resolved match lines from the priority resolver, and the multiple match block is configured to generate a MULT signal when multiple matches are detected; and (d) an address encoder that is configured to receive the plurality of resolved match lines, address encoder slave clock, address encoder sense clock, and is configured to communicate with the priority encoder control block and generate an address corresponding to the highest priority match input.
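By way of a non-limiting illustration, the logical function of the address encoder can be sketched in Python. The function name encode_address is hypothetical, and the clocking provided by the priority encoder control block (slave clock, sense clock) is not modeled; the sketch only shows how a one-hot vector of resolved match lines maps to an address.

```python
from typing import List, Optional


def encode_address(resolved: List[bool]) -> Optional[int]:
    """Return the address of the single active resolved match line, if any."""
    if sum(resolved) > 1:
        # The priority resolver is expected to leave at most one line active.
        raise ValueError("resolved match lines must be one-hot or all low")
    for index, active in enumerate(resolved):
        if active:
            return index  # address of the highest priority match
    return None  # no match: there is no address to report


if __name__ == "__main__":
    print(encode_address([False, False, False, True, False]))
```

Returning None for the no-hit case is a choice made for this sketch; in the embodiment above, the absence of a match is indicated by the pehit signal.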
In still another embodiment, a priority resolver circuit is disclosed. The priority resolver circuit includes a first stage that has a first plurality of priority resolver sub-units. Each priority resolver sub-unit is configured to include local hit (pehit) generation circuitry, resolve processing circuitry, and output differentiator and gating circuitry. Further provided is a second stage that has a second plurality of priority resolver sub-units. In a third stage, a single priority resolver sub-unit is provided. In this embodiment, only one priority resolver sub-unit is configured to be active at one processing time in each of the first, second and third stages of the priority resolver circuit.
In another embodiment, a priority resolver circuit with N=4096 match line inputs is disclosed. The priority resolver includes a first stage with 256 priority resolver sub-units each having M=16 data inputs, a clock input, an enable input, a pehit output and M=16 data outputs. Each priority resolver sub-unit is configured to include local hit generation circuitry, local model delay circuitry, dynamic OR circuitry, and output differentiator and gating circuitry. Further provided is a second stage with 16 priority resolver sub-units each having M=16 data inputs, a clock input, an enable input, a pehit output and M=16 data outputs. In a third and final stage, a single priority resolver sub-unit is provided having M=16 data inputs, a clock input, an enable input, a pehit output and M=16 data outputs. In this embodiment, only one priority resolver sub-unit in each stage is configured to be enabled to reduce power consumption. Alternate embodiments include similar configurations but, instead of enabling only one sub-unit per stage, enable all sub-units in any one stage. Enabling all sub-units in any one stage will boost performance at the expense of power. One reasonably skilled in the art could determine that enabling the later stages (with fewer sub-units) of the resolver is a good power versus performance trade-off. It is also apparent that one reasonably skilled in the art could conceive of alternate embodiments which include a heterogeneous mix of several different sub-units each varying in the parameter M.
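The three-stage arrangement for N=4096 can be illustrated with the following Python sketch. It is a behavioral model under assumptions that are not stated in the text above: pehit outputs are assumed to feed the data inputs of the next stage, and the resolved outputs of a later stage are assumed to select which single sub-unit of the preceding stage is enabled. The names resolve_4096 and first_active are hypothetical.

```python
from typing import List, Tuple

M = 16  # data inputs per sub-unit, as in the N=4096 embodiment


def first_active(bits: List[bool]) -> int:
    """Return the lowest index that is set, or -1 if none is set."""
    for index, bit in enumerate(bits):
        if bit:
            return index
    return -1


def resolve_4096(matchlines: List[bool]) -> Tuple[List[bool], bool]:
    """Return (resolved match lines, global hit) for 4096 inputs."""
    assert len(matchlines) == M ** 3
    # pehit of each of the 256 first-stage sub-units (OR over 16 inputs).
    stage1_hits = [any(matchlines[i * M:(i + 1) * M]) for i in range(M * M)]
    # pehit of each of the 16 second-stage sub-units (OR over 16 pehits).
    stage2_hits = [any(stage1_hits[i * M:(i + 1) * M]) for i in range(M)]
    resolved = [False] * len(matchlines)
    # Third stage: the single sub-unit picks the winning second-stage unit.
    top = first_active(stage2_hits)
    if top < 0:
        return resolved, False  # no match anywhere in the array
    # Second stage: only sub-unit `top` is enabled and resolves its inputs.
    mid = top * M + first_active(stage1_hits[top * M:(top + 1) * M])
    # First stage: only sub-unit `mid` is enabled and resolves its inputs.
    low = mid * M + first_active(matchlines[mid * M:(mid + 1) * M])
    resolved[low] = True
    return resolved, True


if __name__ == "__main__":
    lines = [False] * 4096
    lines[1000] = lines[3000] = True
    out, hit = resolve_4096(lines)
    print(out.index(True), hit)
```

In this model only three of the 273 sub-units (256 + 16 + 1) perform a resolve for any given search, which is one way to picture the power saving obtained by enabling a single sub-unit per stage.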
The advantages of the present invention are numerous. Most notably, the priority resolver circuit is implemented in a multi-stage hierarchical architecture. The hierarchical architecture permits low power by enabling only a small number of priority resolver sub-units, but still maintains high performance. In addition, the priority resolver employs low-power dynamic logic. The use of dynamic logic, as disclosed in the embodiments of the present invention, provides for high performance circuitry that can be compactly designed in silicon using less area. This advantage translates into reduced cost of manufacturing while providing the speed needed in today's CAM applications, such as Internet related equipment. Another advantage of the present invention is that each priority resolver sub-unit, in one embodiment, includes local hit (pehit) generation circuitry and output differentiator and gating circuitry. The local hit generation circuitry permits low power operation by enabling the resolve processing circuits only when needed. The output differentiator and gating circuitry is designed to isolate the resolve processing circuits of each priority resolver sub-unit so as to prevent inadvertent turn-ons when the particular priority resolver sub-unit is not the active stage device. This implementation, as described in greater detail below, provides for superior power savings and enhanced speed over the prior art. It is also important to note that a priority encoder of the present invention preferably includes unique multiple match circuitry. This multiple match circuitry is designed to compare resolved match line data and unresolved match line data and then rapidly indicate when multiple matches exist. In combination, the disclosed embodiments provide for a powerful priority encoder circuit that can significantly improve the performance of address generation in CAM circuits and their end-product implementation (e.g., routers).
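The comparison performed by the multiple match circuitry can be illustrated, at a purely logical level, by the following Python sketch. The function name multiple_match is hypothetical, and the sketch assumes that a multiple match is indicated whenever an active unresolved match line differs from the corresponding resolved match line; the actual circuit and its clocking are not modeled.

```python
from typing import List


def multiple_match(matchlines: List[bool], resolved: List[bool]) -> bool:
    """Indicate a multiple match: an active line the resolver did not select."""
    if len(matchlines) != len(resolved):
        raise ValueError("both vectors must have the same number of lines")
    # A line that is active before resolution but low after resolution means
    # at least one lower-priority match exists besides the winner.
    return any(raw and not win for raw, win in zip(matchlines, resolved))


if __name__ == "__main__":
    raw = [False, True, True, False]
    won = [False, True, False, False]
    print(multiple_match(raw, won))
```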
Other advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.