The invention relates to techniques for priority encoding. The priority encoding techniques can be applied, for example, in a content addressable memory (CAM).
Various techniques for priority encoding (also known as resolving multiple responses) are known. Priority encoding has many different applications but is usually used to obtain a location of a selected response source. An exemplary application is in content addressable memory (CAM), where priority encoding is typically performed on the match signals from a CAM array to obtain a location of a priority matching entry, which can be thought of as the highest or lowest priority matching entry.
FIG. 1 shows features of a previously proposed priority encoding (PE) system 10. PE system 10 has M input lines that receive response signals R0 through RMxe2x88x921 from M sources. In response, system 10 provides priority signals P0 through PMxe2x88x921 on M output lines to a standard binary encoder (not shown). Priority signal Pi indicates that the respective response signal Ri is asserted and has priority; only one of priority signals P0 through PMxe2x88x921 is asserted at a time.
FIG. 1 shows circuitry 12 for the ith input line, and system 10 includes similar circuitry for each of the other input lines. System 10 includes priority encoder (PE) latch 14, which includes a flip-flop in each line""s circuitry, as illustrated by flip-flop 16 in circuitry 12. System 10 also includes a responder (RSP) and ripple inhibit chain 20 and NAND gate array 22.
In operation, response signals R0 through RMxe2x88x921 are first stored in PE latch 14, while RSP_b line 24 and inhibit line Ii+1 26 are precharged high. If there are any responders, RSP_b line 24 is pulled low by M3 transistor 28 in response to signal Di from the Q output of flip-flop 16. RSP_b line 24 serves as an N-input OR gate indicating whether there are any responders among the sources of response signals R0 through RMxe2x88x921. If there is a responder on the ith line Ri, M2 transistor 30 turns on in response to Di, pulling down the potential of Ii+1 line 26. This does not affect the inhibit lines for prior lines (i.e. for Ii, Iixe2x88x921, . . . I0), since M1 transistor 32 is turned OFF by signal Dixe2x88x92b from the not Q output (labeled Q_b) of flip-flop 16.
If the source of Ri is the uppermost responder, i.e. if none of the prior response signals R0 through Rixe2x88x921 is asserted, the Ii line is still at Vdd and the output of NAND gate 34 goes low when the Q output of flip-flop 16 goes high. The low output from NAND gate 34 indicates that Ri is asserted and has priority. Only one of the NAND gates in array 22 can be low at any time. The outputs of all the NAND gates in array 22 therefore form a one-of-M signal that is provided to the binary encoder (not shown), which returns the binary address of the uppermost responder.
Performance of the PE system described above is limited by the speed of the inhibit chain that includes inhibit lines I0 through IM, including inhibit line Ii+1 26. Without lookahead, speed is related to the delay through inhibit pass transistors, including transistor 32. The inhibit pass chain can be modeled as a lumped RC ladder network. The inhibit chain""s delay is directly related to resistance of transistor 32 and the other inhibit pass transistors and to the capacitance of the inhibit lines.
Performance of the PE system can be improved by breaking the inhibit chain into sections and inserting a non-inverting buffer between sections. The number of units per section can be chosen to minimize delay.
Performance can be further improved by using lookahead so that the inhibit signal does not have to propagate through all the sections of the inhibit chain. A long linear structure can be replaced by a branched tree-like structure that has less latency. The RSP_b signal, if sufficiently fast, is used to generate all necessary lookahead information. This is done by breaking the RSP chain into buffered sections such that each section is isolated from those below and therefore sends the proper signal to appropriate inhibit chains. The delay of the RSP chain can also be minimized.
FIG. 2 shows tree structure 40, which can aid in speeding up RSP and ripple inhibit chain 20. Tree structure 40 can be used, for example, in a PE system with 64 inputs and eight sections. Each section has eight of the input lines as in FIG. 1 and each also has a common RSP_B line 24 and a continuous inhibit chain with Ii+1 line 26. Tree structure 40 illustratively receives signals RSP_b1 through RSP_b7 from the RSP and ripple inhibit chain 20 of each of the first seven sections, sections 1 through 7. In response, tree structure 40 provides section inhibit signals SI2 through SI8 to the inhibit line of each of the last seven sections. In the illustrated circuit, the eighth section""s signal RSP_b8 is not used to obtain section inhibit signals because there is no ninth section.
If a section""s RSP_bj signal is low, indicating a match in the section, or if a prior section""s RSP_b signal is low, logic components 42 through 58, each of which could be an AND gate, provide a low section inhibit signal SIj+1 for the next section. For example, if inputs 3 and 62 are asserted, then RSP_b1 and RSP_b7 will each be asserted, i.e. low. Tree structure 40 will assert all of the section inhibit signals SI2 through SI8, inhibiting sections 2 through 8 from providing any asserted outputs. As a result, P3 would be asserted, but P62 would not.
Tree structure 40 improves performance because each section""s RSP_bi signal is obtained more rapidly than the inhibit signal from its last line because transistor 28 (and its counterpart for other lines in the section) obtain RSP_bj in parallel rather than in series like transistor 32 (and its counterparts for other lines in the section). The section inhibit signal SIj+1 therefore becomes available before the inhibit signal from the last line of the previous section would be available if the sections were connected without tree structure 40. If logic components 42 through 58 each include one AND gate, the worst case delay will occur when a signal passes through three AND gates, such as RSP_b4 through components 46, 48, and 50 or RSP_b6 through components 52, 56, and 58. More generally, for M sections, the delay is proportional to log2M.
It would be advantageous to reduce the number of transistors in priority encoders without reducing speed at which results are obtained.
The invention provides improved priority encoding techniques, including priority encoders that can be implemented with relatively few transistors while maintaining priority encoding speed. Some techniques use non-redundant groups of power-of-two subset signals, because such signals can be efficiently obtained from input signals and can also be efficiently used to obtain priority signals. Other techniques provide priority encoders with two circuits: One circuit responds to input signals, providing subset signals; another circuit responds to the subset signals, providing priority signals that indicate at most one input signal that is asserted and has priority. Each of these circuits can be a tree-like structure, the first with static logic and the second dynamic.
These and other features and advantages of the invention will be apparent from the following detailed description and drawings.