The present invention relates to a superscalar microprocessor and, more particularly, to an LRU (Least Recently Used) memory that stores the memory access history used to replace or update the contents of a memory.
As is well known in the art, a superscalar microprocessor executes two or more instructions per cycle. To do so, the cache memory, TLB (Translation Lookaside Buffer), BTB (Branch Target Buffer), and similar structures must support two or more ports, and they are generally implemented as 4-way set associative structures.
For the LRU memory required by a 4-way set associative implementation to support two or more ports, the results of the entries accessed simultaneously must all be reflected when the entry access history (LRU data) is changed.
As an example, consider the memory access result of a cache memory supporting two ports. There are four cases: both ports miss; the first port misses and the second port hits; the first port hits and the second port misses; and both ports hit. The cache memory access information corresponding to these four cases is distinguished so that it can be reflected in the entry access history (that is, the LRU data). To this end, the operations of accessing the LRU memory, modifying the LRU data, and writing the modified LRU data back to the LRU memory must be performed every cycle. In the conventional art, these operations are performed in a cache control block.
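The four cases and the read-modify-write of the LRU data can be sketched in Python as follows. This is only an illustrative sketch and not the circuit of the conventional art: the function names are hypothetical, the 3-bit tree pseudo-LRU encoding is an assumption (the text states only that the LRU data of a 4-way entry is 3 bits wide), and a missed port is assumed to leave the LRU data unchanged.

```python
def classify(port0_hit, port1_hit):
    """Name the four two-port access cases described in the text."""
    if port0_hit and port1_hit:
        return "hit/hit"
    if port0_hit:
        return "hit/miss"
    if port1_hit:
        return "miss/hit"
    return "miss/miss"


def touch(lru, way):
    """Mark `way` (0..3) as most recently used in an ASSUMED 3-bit tree encoding.

    bit0: 0 = victim among ways 0-1, 1 = victim among ways 2-3
    bit1: victim within ways 0-1 (0 = way 0, 1 = way 1)
    bit2: victim within ways 2-3 (0 = way 2, 1 = way 3)
    """
    if way < 2:
        lru |= 0b001                                   # point the root at the other half
        lru = (lru | 0b010) if way == 0 else (lru & ~0b010)
    else:
        lru &= ~0b001
        lru = (lru | 0b100) if way == 2 else (lru & ~0b100)
    return lru & 0b111


def victim(lru):
    """Return the way the assumed tree encoding currently points at."""
    if lru & 0b001 == 0:
        return 1 if lru & 0b010 else 0
    return 3 if lru & 0b100 else 2


def update_two_ports(lru, port0_way, port1_way):
    """Fold both port results into the LRU data; None means the port missed.

    Assumption: hits are applied in port order and a miss changes nothing.
    """
    for way in (port0_way, port1_way):
        if way is not None:
            lru = touch(lru, way)
    return lru
```

For example, `update_two_ports(0b000, 0, 2)` models a cycle in which the first port hits way 0 and the second port hits way 2; under the assumed encoding the resulting LRU data points at way 1 as the next victim.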
In this case, the LRU data is read out from the LRU memory and transferred to the cache control block. The LRU data is modified in the cache control block to reflect the access results and is then transferred back to the LRU memory to be recorded. This path may become a critical path of the cache memory access.
FIG. 1 illustrates a conventional 8-entry 4-way set associative LRU memory for supporting two ports, which includes an address decoder block 102 and an LRU SRAM block 104.
Referring to FIG. 1, the address decoder block 102 latches and decodes the 3-bit index address for the 8 entries in synchronization with a CLOCK signal. When the LRU_READ signal is activated, the word line signal WORD[7:0] is generated in synchronization with the READ_TIME signal and output to the LRU SRAM block 104. Likewise, when the LRU_WRITE signal is activated, the word line signal WORD[7:0] is generated from the decoding result in synchronization with the WRITE_TIME signal and output to the LRU SRAM block 104.
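The behavior of the address decoder block 102 can be approximated by the following Python sketch. The function name and the boolean modeling of the timing signals are assumptions made for illustration; the latching on CLOCK is omitted, and a one-hot encoding of WORD[7:0] is assumed from the statement that only the enabled entry is accessed.

```python
def decode_word_lines(index, lru_read, lru_write, read_time, write_time):
    """Return WORD[7:0] as an integer with at most one bit set.

    index       3-bit index address (0..7), assumed already latched
    lru_read    LRU_READ signal level
    lru_write   LRU_WRITE signal level
    read_time   READ_TIME strobe level
    write_time  WRITE_TIME strobe level
    """
    if not 0 <= index < 8:
        raise ValueError("index must fit in 3 bits")
    # A word line is driven only while the matching timing strobe is active.
    enable = (lru_read and read_time) or (lru_write and write_time)
    return (1 << index) if enable else 0
```

With `index = 5`, LRU_READ active, and READ_TIME active, the sketch returns `0b00100000`; with READ_TIME inactive it returns 0, so no entry is selected.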
When the LRU_READ signal is activated, the LRU SRAM block 104 reads only the entry enabled by the WORD[7:0] signal supplied from the address decoder block 102, and outputs the stored 3-bit READ_DATA to a control block (not shown). Conversely, when the LRU_WRITE signal is activated, the LRU SRAM block 104 stores the externally applied 3-bit WRITE_DATA in the entry enabled by the WORD[7:0] signal supplied from the address decoder block 102.
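A behavioral sketch of the LRU SRAM block 104, together with the per-cycle read-modify-write carried out by the control block, might look as follows in Python. The class name, the method names, and the `modify` callback are hypothetical and stand in for logic that the text does not specify; a one-hot WORD[7:0] value is assumed.

```python
class LruSram:
    """Toy model of an 8-entry LRU SRAM holding 3 bits per entry."""

    def __init__(self, entries=8):
        self.cells = [0] * entries

    def _selected(self, word):
        # Exactly one word line within range must be active (assumption).
        if word == 0 or word & (word - 1) or word >= (1 << len(self.cells)):
            raise ValueError("WORD must be one-hot")
        return word.bit_length() - 1

    def read(self, word):
        """LRU_READ: return the 3-bit READ_DATA of the enabled entry."""
        return self.cells[self._selected(word)]

    def write(self, word, write_data):
        """LRU_WRITE: store the 3-bit WRITE_DATA in the enabled entry."""
        self.cells[self._selected(word)] = write_data & 0b111


def control_block_cycle(sram, word, modify):
    """Read the LRU data, let the control block modify it, and write it back."""
    read_data = sram.read(word)
    write_data = modify(read_data)      # placeholder for the way-hit update logic
    sram.write(word, write_data)
    return read_data, write_data
```

In this model all three steps of `control_block_cycle` occur in sequence for every access, which mirrors the round trip between the LRU memory and the control block described in the text.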
The conventional LRU memory described above is easy to implement, but it is difficult to use because the control block must control the READ_TIME, WRITE_TIME, WRITE_DATA, and related signals with high accuracy. There is also the burden that the operation of receiving the READ_DATA and generating the WRITE_DATA (the LRU update data) reflecting the way-hit information must be performed quickly. Moreover, the read operation from the LRU memory and the operations of the control block for receiving the READ_DATA, generating the WRITE_DATA, and transferring the WRITE_DATA to the LRU memory must all be performed every cycle, which impedes the implementation of a high-performance processor.