The invention described herein relates to microprocessor cache apparatus.
Directory macros are microprocessor cache components that are used to determine if a particular address is currently held in the main cache RAM. FIG. 1 (denominated "Prior Art") shows a typical cache directory RAM scheme. The directory RAM holds a portion of the addresses stored in the main cache RAM. A portion of the full address is used to retrieve and latch an entry from the RAM. Another portion of the address (tag) is then compared against the latched entry from the RAM. If the tag matches the entry retrieved from the RAM, the cache is said to have found a "hit".
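The lookup described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the bit widths, the placement of the index in the low address bits, and the names `split_address`, `Directory`, `fill`, and `lookup` are hypothetical and do not come from the specification.

```python
# Behavioral sketch of a directory lookup. All widths and names are assumed.
INDEX_BITS = 4  # assumed: address bits that select one directory RAM entry
TAG_BITS = 8    # assumed: address bits stored in the entry and compared


def split_address(address):
    """Split an address into (index, tag); the index is taken from the low bits."""
    index = address & ((1 << INDEX_BITS) - 1)
    tag = (address >> INDEX_BITS) & ((1 << TAG_BITS) - 1)
    return index, tag


class Directory:
    """Holds one tag per index, standing in for the directory RAM."""

    def __init__(self):
        self.entries = [None] * (1 << INDEX_BITS)

    def fill(self, address):
        # Record that this address is now held in the main cache RAM.
        index, tag = split_address(address)
        self.entries[index] = tag

    def lookup(self, address):
        # Retrieve ("latch") the entry selected by the index, then compare
        # it against the tag portion of the address.
        index, tag = split_address(address)
        latched = self.entries[index]
        return latched is not None and latched == tag
```

In this model, a lookup returns a hit only when the entry selected by the index portion holds the same tag as the address being looked up.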
FIG. 2 (denominated "Prior Art") shows the typical logic flow from the RAM output to the generation of the hit signal. A set of bits from the RAM are compared bit by bit with the corresponding tag bits using XOR gates. The output of any XOR is a "1" if the corresponding RAM bit and tag bit mismatch. If none of the XOR gates generates a mismatch, the output of the second stage NOR gate will be a "1", meaning the cache found a hit. The speed at which the cache can operate is directly affected by how long it takes for this compare structure to evaluate.
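The two-stage compare just described can be written as a short logic model. This is a sketch of the logic function only, not of circuit timing; the function name `hit_signal` and the use of lists of 0/1 integers are assumptions made for illustration.

```python
def hit_signal(ram_bits, tag_bits):
    """Return 1 on a hit, 0 otherwise, using a per-bit XOR stage and a NOR stage."""
    if len(ram_bits) != len(tag_bits):
        raise ValueError("RAM entry and tag must have the same number of bits")
    # First stage: each XOR outputs 1 where the RAM bit and tag bit differ.
    mismatches = [r ^ t for r, t in zip(ram_bits, tag_bits)]
    # Second stage: the NOR outputs 1 only if no XOR reported a mismatch.
    return 0 if any(mismatches) else 1
```

A single differing bit is enough to drive the NOR output to 0, which is why every bit position has to be evaluated before the hit signal is valid.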
It is a primary object of the invention to provide a minimum delay cache RAM match/mismatch detection circuit.
It is a further object of the invention to eliminate the delay associated with complementing of the latched data.
It is still a further object of the invention to provide reduced delay XOR logic in the compare function.
These objects are attained by the apparatus of our invention. The apparatus of the invention provides a low delay circuit for generating a RAM cache "match/mismatch" signal.
Specifically, our invention provides a dual rail output from the RAM and associated latches, typically two latches, one latch to indicate if a "0" was read from the RAM and the other to indicate if a "1" was read from the RAM. In addition to removing the delay associated with generating the complement of the latch data as in the prior art, the use of a transmission gate XOR (instead of the gate input XOR of the prior art) further reduces circuit delay. This combination of eliminating generation of the complement of the latch data and using a transmission gate XOR in place of a gate input XOR provides a low delay compare function.
In a preferred exemplification the apparatus has tag and data inputs. The data inputs are dual-rail inputs sourcing latch pairs, where one of the latch pairs latches true-data and one of said latch pairs latches complement-data. The latch and compare apparatus further includes two sets of transistor pairs, one set of the gates receiving data from the true and complement outputs of the true-data latch, and another set of the gates receiving data from the complement and true outputs of the complement-data latch. The transistor pairs hold and drive signals from the latch pairs. The apparatus has a pair of XOR transmission gates, with one XOR transmission gate transistor pair receiving as gate inputs tag and inverted tag signals, and the other XOR transmission gate transistor pair receiving as gate inputs inverted tag and tag signals. One of the XOR transmission gate transistor pairs receives a true-data signal, and the other of the XOR transmission gate transistor pairs receives a complement-data signal. The XOR transmission gate pair outputs a match-mismatch signal.
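The behavior of the dual-rail latch pair and the transmission gate XOR can be sketched at the logic level as follows. This is an illustrative model only: it does not represent transistor-level behavior or delay, and the signal polarity chosen here (a "1" output meaning mismatch, with the true-data rail passed when the tag is 0) is an assumption, as are the names `dual_rail_latch` and `transmission_gate_xor`.

```python
def dual_rail_latch(bit):
    """Return (true_rail, comp_rail) for a bit read from the RAM.

    Both rails are available directly from the latch pair, so no inverter
    is needed downstream to produce the complement.
    """
    return bit, 1 - bit


def transmission_gate_xor(true_rail, comp_rail, tag):
    """Model a pair of transmission gates steered by tag and inverted tag.

    Assumed polarity: when tag is 0 the gate carrying the true-data rail
    conducts; when tag is 1 the gate carrying the complement-data rail
    conducts. The selected rail is the mismatch signal (1 = mismatch).
    """
    return comp_rail if tag else true_rail
```

Because exactly one of the two gates conducts for any tag value, the selected rail equals the XOR of the stored bit and the tag bit, with the tag acting only as a select signal.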
In the apparatus the data input may be pairs of true-complement bit column inputs comprising an input for each data bit column, and the tag input may be an input signal comprising an input for each tag bit.
The transmission gates are in series with said data inputs, and the latch pairs are in series with the data inputs, with one of the latch pairs latching true-data inputs and one of the latch pairs latching complement-data inputs.