1. Field of the Invention
The present invention relates to semiconductor circuit design, and more particularly to circuit structures for low power operation of content addressable memories and random access memories.
2. Description of Prior Art
Content addressable memory (CAM) circuit structures are typically used to allow fast and efficient searches, translations, or pattern matching of memory content. CAM provides a general solution to memory searches; unlike standard memories that associate data with an address, a CAM associates an address with data. When data is presented on the inputs of the CAM, the CAM searches for a match for the data in the CAM without regard to address. When a match is found the CAM identifies the address location of the data. In microprocessors, CAMs have been used most notably for tag matching in Translation Lookaside Buffers (TLBs) and associative caches, and to resolve instruction dependencies in rename and issue pipeline stages.
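By way of illustration only, the search behavior described above can be sketched in software. The function name `cam_search` and the sample contents are hypothetical, and a hardware CAM compares all entries in parallel rather than iterating over them.

```python
from typing import List


def cam_search(entries: List[int], key: int) -> List[int]:
    """Return every address whose stored word equals the search key.

    A conventional memory maps address -> data; this models the reverse
    lookup (data -> address) that a CAM performs.
    """
    return [addr for addr, word in enumerate(entries) if word == key]


# Hypothetical contents: the word 0x3A is stored at addresses 0 and 2.
entries = [0x3A, 0x7F, 0x3A, 0x00]
matches = cam_search(entries, 0x3A)
no_match = cam_search(entries, 0x55)
```

An empty result corresponds to the case where no entry matches the presented data.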
One problem associated with CAMs is power consumption. For example, in a CAM implemented as a dynamic wired XNOR function, where all match transistors are connected to one matchline, the matchline is precharged to a high logic value before evaluating the function. Whenever a mismatch occurs, the matchline is discharged. In situations where mismatches are prevalent, the frequent recharging of the matchline consumes considerable power. This high power consumption limits the physical size of CAMs and also limits their use in low power applications.
Recent work has proposed cascaded matching logic, where the match transistors are connected in series to form an AND function rather than a wired XNOR function. This scheme prevents the matchline from being discharged every cycle when a mismatch occurs, thereby reducing power consumption. In order to provide high speed matching, a sense-amp can be added to the output. Alternatively, the AND function is decomposed into several sub-functions. While these methods reduce power consumption by reducing the frequency of precharging the matchline node, both approaches introduce additional sources of power consumption in the form of short-circuit currents (sense-amp in its linear region) and/or internal nodes (decomposed AND function) that need to be charged and discharged. An AND function also introduces additional gate capacitance as extra transistors (static logic) and/or larger transistors (high stack domino logic) are introduced.
The main problem with the AND function approach, however, is that it is not as scalable as the wired XNOR approach. In an AND function the transistor stack height is dependent on the number of tag-bits that are to be matched. The delay of the logic (R*C) increases quadratically with stack height, as both capacitance (C) and resistance (R) are added for each additional transistor. In a wired XNOR function, the delay increases linearly with the number of tag-bits, as only capacitance (but not resistance) is added to the critical path.
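The scaling argument above can be illustrated with a simple lumped R*C estimate. This is a minimal sketch, assuming each transistor contributes one hypothetical unit of resistance (`r_unit`) and one unit of capacitance (`c_unit`); the function names are illustrative and the figures are not device data.

```python
def and_stack_delay(n_bits: int, r_unit: float = 1.0, c_unit: float = 1.0) -> float:
    """Series stack: every added transistor adds both R and C to the path."""
    return (n_bits * r_unit) * (n_bits * c_unit)


def wired_xnor_delay(n_bits: int, r_unit: float = 1.0, c_unit: float = 1.0) -> float:
    """Wired XNOR: every added match device adds only C to the matchline."""
    return r_unit * (n_bits * c_unit)


# Doubling the tag width quadruples the AND-stack estimate
# but only doubles the wired XNOR estimate.
and_8, and_16 = and_stack_delay(8), and_stack_delay(16)
xnor_8, xnor_16 = wired_xnor_delay(8), wired_xnor_delay(16)
```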
Referring to power consumption in dynamic wired XNOR match logic, one problem is the logic that interfaces to the match result. If the match result is connected to static logic, then intermediate nodes in the logic function may also be charged and discharged each cycle due to the precharging and discharging of the matchline node, unnecessarily wasting power. If dynamic logic is used, then it needs to be triggered by an evaluation signal rather than by the data signals (match results) themselves. This “sampling” of the data signals introduces extra delay overhead as safety margins need to be used to ensure that the domino logic is not evaluated before all data signals have reached their final value. Sampling also introduces additional power dissipation as the extra gate capacitance of the evaluation transistor of the domino logic needs to be driven each cycle. Referring to interfacing with logic dependent on the match result, the main problem with the dynamic wired XNOR match logic is that it generates an event on the matchline when there is not a match, rather than when there is a match.
Accordingly, it would be desirable to add the low power advantages of the AND function approach to the speed advantages of the dynamic wired XNOR function. Circuit structures that can combine these two features are presented in this text.
Random Access Memory (RAM) circuit structures are mainly used to efficiently store and read data. However, RAM structures generally have high power consumption due to precharging and discharging of high capacitance bitlines every cycle a read takes place. In addition, when sense-amps are used, two bitlines, data and data′, are needed, further increasing power consumption. Fast sense-amps also consume significant power as the sense transistors are put in their linear region in order to react quickly to a voltage drop on one of the bitlines, creating short-circuit current. If sense-amps are not used, the read operation may be significantly slowed due to the time it takes for the size limited read and storage transistors to discharge the bitline. This text presents an alternative fast low power solution to this problem by reducing the capacitance on the bitlines through banking.
According to an embodiment of the present invention, a method is provided for matching two data sources through a wired exclusive-nor (XNOR). The method includes discharging a first tag line and a second tag line associated with a first tag bit, and precharging a matchline, connected to a plurality of tag match functions, to a first potential, wherein each tag match function comprises one or more match logic devices. The method includes reading a plurality of tag bits, including the first tag bit and corresponding data bits, onto a plurality of corresponding tag lines and data lines respectively, and determining a match between each tag bit and data bit, wherein the matchline is pulled to a second potential upon each match logic device indicating a match, and wherein the matchline is held at the first potential upon any match logic device indicating a mismatch.
Each match logic device has a pulling strength, wherein the pulling strengths are ratioed, the match logic devices pulling to the first potential being stronger than the match logic devices pulling to the second potential, wherein upon the match logic devices simultaneously pulling to different potentials, the matchline is clamped at the first potential.
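The ratioed behavior described above can be modeled at the logic level. The following is a minimal sketch, assuming the first potential is ground (0) and the second potential is VDD (1); the function name `matchline_level` is hypothetical, and analog effects such as contention current are not modeled.

```python
from typing import Sequence

FIRST, SECOND = 0, 1  # assumption: first potential = ground, second = VDD


def matchline_level(tag_bits: Sequence[int], data_bits: Sequence[int]) -> int:
    """Logic-level model of a ratioed wired XNOR matchline.

    Devices that see a mismatch pull toward FIRST and are assumed stronger
    than devices that see a match and pull toward SECOND, so any mismatch
    clamps the matchline at FIRST.
    """
    pairs = list(zip(tag_bits, data_bits))
    pulls_first = any(t != d for t, d in pairs)
    pulls_second = any(t == d for t, d in pairs)
    if pulls_first:
        return FIRST  # stronger devices win the contention
    return SECOND if pulls_second else FIRST  # otherwise stays precharged


full_match = matchline_level([1, 0, 1], [1, 0, 1])
one_mismatch = matchline_level([1, 0, 1], [1, 1, 1])
```

In this model an event on the matchline (a transition to the second potential) occurs only when every bit matches.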
The method further comprises pulsing a tag line by resetting the tag line to a logic 0 after the match logic devices have evaluated whether the corresponding tag bit has a logic value of 1. Pulsing a tag line further comprises latching a tag bit, and resetting a latch to a value of logic 0 upon determining the tag bit to be a value of logic 1 after the match logic devices have evaluated whether the corresponding tag bit has a logic value of 1. The method includes pulling the matchline partially towards the second potential upon determining a match for each tag bit, and pulling the matchline to the second potential using a sense-amplifier.
The XNOR function is a static wired XNOR function, and the method further comprises pulling, with at least one tag match function, the matchline to the second potential upon evaluating a match, and pulling, with at least one tag match function, the matchline to the first potential upon evaluating a mismatch, wherein the matchline implements a static wired XNOR function.
The wired XNOR function is implemented as a dynamic XNOR-AND function, and the method includes pulling, with a set of tag match functions, to the first potential, and pulling, through an AND structure of tag match functions, to the second potential, wherein one or more tag match functions are connected to one another in series forming an AND(XNOR(tag[i]), . . . ,XNOR(tag[i+j])) function, where i and j indicate the corresponding tag bits of the tag match functions.
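The AND(XNOR(tag[i]), . . . ,XNOR(tag[i+j])) function can be sketched as follows. This is an illustrative logic-level model only, assuming the first potential is ground (0) and the second potential is VDD (1), that bits i through i+j form the series structure, and that all remaining bits are parallel devices pulling to the first potential on a mismatch; the function names are hypothetical.

```python
from typing import Sequence

FIRST, SECOND = 0, 1  # assumption: first potential = ground, second = VDD


def xnor(a: int, b: int) -> int:
    """Single-bit match function: 1 when the two bits are equal."""
    return int(a == b)


def xnor_and_matchline(tag: Sequence[int], data: Sequence[int], i: int, j: int) -> int:
    """Matchline level for a dynamic XNOR-AND structure (logic model only).

    Bits i..i+j are matched by series-connected devices (the AND structure);
    all other bits are matched by parallel devices.
    """
    series = range(i, i + j + 1)
    series_match = all(xnor(tag[k], data[k]) for k in series)
    parallel_mismatch = any(
        not xnor(tag[k], data[k]) for k in range(len(tag)) if k not in series
    )
    if parallel_mismatch:
        return FIRST  # a parallel device holds the matchline at FIRST
    return SECOND if series_match else FIRST  # series path conducts only on full match


tag = [1, 0, 1, 1]
all_match = xnor_and_matchline(tag, [1, 0, 1, 1], i=0, j=1)
series_miss = xnor_and_matchline(tag, [1, 1, 1, 1], i=0, j=1)
parallel_miss = xnor_and_matchline(tag, [1, 0, 0, 1], i=0, j=1)
```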
The first potential is ground and the second potential is VDD. Alternatively, the first potential is VDD and the second potential is ground.
The method includes storing a plurality of data entries in a memory, wherein a matchline with associated match logic is replicated for each data entry, storing a tag in a latch, implementing, through the memory and the match logic, a content addressable memory, and implementing, through the matchline of each data entry, a wake up function.
The method includes inhibiting the evaluation of a matchline, wherein the content addressable memory device comprises a gating device connected to a tag valid signal and a clock for gating the clock to the tag latch, thereby avoiding latching a new tag when the tag is invalid and inhibiting the evaluation of the corresponding matchlines.
The precharge signal of each matchline is a clock. Alternatively, the precharge signal of each matchline is a delayed derivative of the matchline signal such that a self-resetting structure is implemented.
According to an embodiment of the present invention, a method of inhibiting the evaluation of a matchline of a content addressable memory device is provided. The method comprises gating a clock to a tag latch using a tag valid signal, latching the tag valid signal, gating a precharge signal to a matchline using the latched tag valid signal, propagating a matchline value unchanged through a clearing device while the tag valid signal indicates that the tag is valid, and discharging the output of the clearing device while the tag valid signal indicates that the tag is invalid.
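The gating and clearing steps above can be sketched at the logic level. This is a minimal illustration, assuming active-high signals and that a discharged output corresponds to logic 0; the function names `gate_clock` and `clearing_device` are hypothetical.

```python
def gate_clock(clk: int, tag_valid: int) -> int:
    """Gated clock for the tag latch: toggles only while the tag is valid."""
    return clk & tag_valid


def clearing_device(matchline: int, latched_tag_valid: int) -> int:
    """Pass the matchline value through when valid, else force it to 0.

    Assumption: "discharging the output" is modeled as driving logic 0.
    """
    return matchline if latched_tag_valid else 0


# With an invalid tag the latch sees no clock edge and the output is cleared.
clk_when_invalid = gate_clock(clk=1, tag_valid=0)
out_when_invalid = clearing_device(matchline=1, latched_tag_valid=0)
out_when_valid = clearing_device(matchline=1, latched_tag_valid=1)
```

Because the tag latch receives no clock while the tag is invalid, the tag lines do not switch and the corresponding matchlines are not evaluated.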
The method includes determining a wake up of a plurality of data entries through a ready logic, and discharging the output of the ready logic upon deassertion of a latched tag valid signal.
A method of reading a banked random access memory is provided according to an embodiment of the present invention. The method includes precharging a plurality of banked bitlines to a first potential, precharging an OR device connected to each banked bitline to a second potential, and applying a plurality of data and read signals to the read devices of each banked bitline. The method further includes pulling a banked bitline to the second potential upon a read device reading a data value matching the second potential, and evaluating the output of the OR device to the first potential upon one or more banked bitlines being pulled to the second potential.
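The read sequence above can be sketched as a logic-level model. This is an illustration only, assuming the first potential is ground (0) and the second potential is VDD (1), with one asserted read signal per access; the function name `read_banked` and the sample contents are hypothetical.

```python
from typing import List, Tuple

FIRST, SECOND = 0, 1  # assumption: first potential = ground, second = VDD


def read_banked(banks: List[List[int]], bank: int, row: int) -> Tuple[List[int], int]:
    """Model one read of a banked RAM with a precharged OR device.

    Returns the bitline levels after the read and the OR-device output.
    """
    bitlines = [FIRST] * len(banks)  # precharge every banked bitline
    or_output = SECOND               # precharge the OR device
    if banks[bank][row] == SECOND:   # read device sees data at SECOND
        bitlines[bank] = SECOND      # only the selected bank's bitline moves
    if any(level == SECOND for level in bitlines):
        or_output = FIRST            # OR device evaluates
    return bitlines, or_output


banks = [[0, 1], [1, 0]]  # hypothetical contents: two banks of two rows
lines_hit, out_hit = read_banked(banks, bank=0, row=1)
lines_miss, out_miss = read_banked(banks, bank=1, row=1)
```

Only the short bitline of the selected bank can switch during a read, which is the capacitance reduction that banking provides.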
According to an embodiment of the present invention, a banked random access memory is provided, including a plurality of read devices connected in parallel to a read bitline, wherein each read device is further connected to a data signal and a read signal, each read device propagating a value of the data signal upon assertion of the read signal, a plurality of banks, each bank comprising a plurality of read devices connected in parallel to a read bitline, wherein the read bitline of each bank is precharged to a first potential, and a precharged OR-device connected to the read bitlines.
The OR-device is precharged to a second potential, the OR-device output is pulled to the first potential upon any read bitline being pulled to the second potential by a read device, and the OR-device output remains at its precharged potential upon all read bitlines remaining at the first potential.
The precharge device of each bank bitline is activated by a delayed derivative of the bitline value, and the precharge device of the OR-device is activated by a delayed version of the output of the OR-device, such that a self-resetting precharged banked random access memory is implemented.
According to an embodiment of the present invention, a wake up device is provided which detects source operands. The wake up device includes a content addressable memory based on a wired XNOR match function, including at least two tag lines for receiving data from a results tag latch and a tag drive, a first tag bit and data bit connected to a first match function, the first match function pulling to a first potential upon evaluating a match, at least a second tag bit and data bit connected to a second match function, the second match function pulling to a second potential upon evaluating a mismatch, and a precharged matchline connecting the tag match functions, pulling to the second potential, the precharged value adapted to indicate a mismatch of the matchline function.
The content addressable memory is dynamic wired XNOR based. The content addressable memory is static wired XNOR based and further comprises a first tag bit and data bit input to a first tag match function, the first tag match function pulling to a first potential upon a match, the first tag bit and data bit input to a second tag match function, the second tag match function pulling to a second potential upon a mismatch, and at least a second tag bit and data bit input to a third tag match function connected to the second potential, the third tag match function pulling to the second potential upon a mismatch.
The content addressable memory is dynamic wired XNOR-AND based, and includes a logic structure based on an AND function including a plurality of XNOR tag match functions each connected to an associated tag bit and data bit, the tag match functions connected in series, the logic structure pulling to the first potential upon all tag match functions in the logic structure indicating a match of their corresponding tag bit and data bit. The content addressable memory further includes zero or more successive tag match functions connected to associated tag bits and data bits, the tag match functions pulling to the second potential upon a mismatch, an end of the AND logic structure connected to the matchline, and a precharge transistor connected to a precharge signal, the matchline, and to the second potential for pulling the matchline to the second potential.
The wake up device further comprises an AND function connected to a tag valid signal and a clock for gating the clock to the results tag latch.
The wake up device includes a ready logic in the form of a plurality of transistors each connected to a content addressable memory matchline, wherein the transistors are connected in parallel-series stacks forming a domino gate, wherein a first end of the OR-AND gate is connected to a footing transistor, the footing transistor further connected to a precharge signal and to the second potential, and a second end of the gate connected to an output node, a precharge transistor connected to the output node, a precharge signal, and the first potential, the ready logic detecting the matching of multiple entries in the content addressable memory.
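The OR-AND structure of the ready logic can be sketched at the logic level. The grouping below is an assumption made for illustration: matchline results are grouped per source operand, each group forming a parallel (OR) stack, and the groups are connected in series (AND). The function name `ready` is hypothetical, and precharge and footing behavior are not modeled.

```python
from typing import List


def ready(matchlines_per_operand: List[List[int]]) -> bool:
    """OR-AND ready logic: every operand needs at least one matching entry.

    Each inner list holds the match results (1 = match) that can wake up
    one source operand; this grouping is an illustrative assumption.
    """
    return all(any(group) for group in matchlines_per_operand)


# Two source operands, three matchlines each (hypothetical values).
both_matched = ready([[0, 1, 0], [1, 0, 0]])
one_missing = ready([[0, 1, 0], [0, 0, 0]])
```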
The first potential is ground and the second potential is VDD. Alternatively, the first potential is VDD and the second potential is ground.
According to an embodiment of the present invention, a wake up device is provided for detecting source operands. The wake up device includes a content addressable memory, a clock signal applied to an AND logic device, a tag valid signal applied to the AND logic device, the AND logic device gating the clock signal, and a tag latch receiving the gated clock signal.
The content addressable memory includes one of a dynamic wired XNOR based match function, a static wired XNOR based match function, and a dynamic wired XNOR-AND based match function.
The wake up device further comprises a latch adapted to latching a tag valid bit onto an internal valid node, a plurality of reset logic devices, each connected to the internal valid node, a content addressable memory matchline, and a unique output node, and a second AND logic device connected to the internal valid node and an inverted clock signal with an output adapted to gate the precharge transistor of a matchline.
The wake up device further comprises a passgate connected to a clock, a tag bit, and an internal tag bit node, a latch comprising two transistors in series connected to the clock, the internal tag bit node, a second potential, an output node, and two transistors in series connected to the clock, the internal tag bit node, a first potential and the output node, and a reset transistor connected to a feed back tag drive signal, the first potential, and to the internal tag bit node.
The first potential is ground and the second potential is VDD. Alternatively, the first potential is VDD and the second potential is ground.