The present invention relates to a cache memory, and more particularly, to a cache hit logic for determining whether a tag address stored in a tag memory coincides with an input tag address.
When typical programs are analyzed, it is noted that memory references over a given time tend to be confined to a limited region. This phenomenon is referred to as locality of reference and is readily understood from the fact that typical computer programs proceed sequentially and make frequent use of program loops and subroutines. Data references likewise tend to be confined to a limited region. Table-lookup procedures and iterative procedures that repeatedly refer to a common memory region or array are typical examples.
When the programs and data that are frequently referred to are stored in a small, high-speed memory, the average memory access time is reduced, and the total time required for executing programs is therefore reduced as well. Such a small, high-speed memory is referred to as a cache memory. In recent technology, the cache memory is integrated on a single chip together with a processor.
The basic operation of the cache memory is as follows. When the processor needs to access the memory, the cache is examined first. When the desired word is found in the cache, the word is read from the cache. When the desired word is not found, the main memory is accessed in order to read the word, and the block that includes the word is transferred from the main memory to the cache memory. The size of the block is about 1 to 16 words.
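The read sequence described above can be sketched in software as follows. This is only an illustrative model, not a description of any particular circuit: the class name `SimpleCache`, the block size `BLOCK_WORDS = 4`, and the unlimited cache capacity (no replacement policy) are assumptions made for the example.

```python
BLOCK_WORDS = 4  # assumed block size in words (the text cites about 1 to 16)


class SimpleCache:
    """Hypothetical model: check the cache first, fetch a block on a miss."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # list of words standing in for main memory
        self.blocks = {}                # block number -> list of cached words
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block_no, offset = divmod(address, BLOCK_WORDS)
        if block_no in self.blocks:
            # The desired word is found in the cache.
            self.hits += 1
        else:
            # Not found: access main memory and transfer the whole block.
            self.misses += 1
            start = block_no * BLOCK_WORDS
            self.blocks[block_no] = self.main_memory[start:start + BLOCK_WORDS]
        return self.blocks[block_no][offset]


memory = list(range(100, 132))  # 32 words of example data
cache = SimpleCache(memory)
first = cache.read(5)   # miss: the block holding addresses 4 to 7 is loaded
second = cache.read(6)  # hit: same block as the previous reference
```

In this sketch, the second reference hits only because it falls in the block that the first reference brought in, which is the effect that locality of reference is expected to produce.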
The performance of a cache memory is measured by its hit ratio. When a processor refers to a memory and the desired data is found in the cache, the reference is called a hit. If the desired data is not found in the cache and is found in the main memory, the reference is called a miss. The hit ratio is obtained by dividing the number of hits by the total number of memory references performed by the processor. The hit ratio is measured experimentally by running typical programs on a computer and counting the number of hits and the number of misses for a given time. In general, the hit ratio is no less than 0.9, which confirms the locality of reference.
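As a rough illustration of how a hit ratio might be counted, the following sketch replays a list of referenced addresses. The function name `measure_hit_ratio`, the block size of 4 words, and the assumption that a block stays in the cache once loaded are all hypothetical choices for the example.

```python
def measure_hit_ratio(trace, block_words=4):
    """Return hits / total references for a list of word addresses.

    Assumes an idealized cache that never evicts a block once it is loaded.
    """
    if not trace:
        raise ValueError("the trace must contain at least one reference")
    cached_blocks = set()
    hits = 0
    for address in trace:
        block = address // block_words
        if block in cached_blocks:
            hits += 1                 # data found in the cache: hit
        else:
            cached_blocks.add(block)  # miss: block is brought in from main memory
    return hits / len(trace)


# A loop that walks the same 8 words ten times: 80 references in total.
loop_trace = list(range(8)) * 10
ratio = measure_hit_ratio(loop_trace)
```

Here only the first reference to each of the two blocks misses, so 78 of the 80 references are hits. The example trace is deliberately loop-heavy; real programs would have to be measured to obtain a meaningful figure.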
A cache hit logic is a circuit, provided in the cache memory, for determining whether data required by a processor is stored in the cache memory. FIG. 1 illustrates a typical cache hit logic. FIG. 2 is a timing diagram of signals used for the cache hit logic illustrated in FIG. 1.
Referring to FIG. 1, a cache hit logic 100 includes a tag memory cell array 110, a row decoder 120, a column decoder 130, a sense amplifier and a latch circuit 140, a comparison logic 150, and an output circuit 160.
The comparison logic 150 includes XNOR gates 151 to 154 for determining whether a tag address TAGOUT<n:0> sensed and latched by the sense amplifier and the latch circuit 140 coincides with a tag address TAGADD<n:0> input from a processor (not shown). Each of the XNOR gates 151 to 154 corresponds to one bit of the tag address TAGOUT<n:0> and the corresponding bit of the tag address TAGADD<n:0>. Each XNOR gate outputs logic ‘1’, that is, a high level comparison signal among X<n:0>, when its two input tag address bits coincide with each other.
The output circuit 160 includes AND gates 161 to 166 for outputting a high level hit signal HIT when all of the comparison signals X<n:0> from the XNOR gates 151 to 154 are at the high level.
However, since the output circuit 160 of the cache hit logic 100 illustrated in FIG. 1 includes multiple stages of AND gates, it takes a relatively long time to output the final hit signal HIT. Therefore, it is difficult to realize a high-speed cache hit logic.
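The behavior of the comparison logic and the staged output circuit can be modeled with a short sketch. The function names are hypothetical, and the use of two-input AND gates is an assumption made for illustration; the fan-in of the gates in FIG. 1 is not stated in the text.

```python
def xnor_compare(tagout, tagadd):
    """Bitwise XNOR: 1 where the stored and input tag bits coincide."""
    return [1 if a == b else 0 for a, b in zip(tagout, tagadd)]


def and_tree(signals):
    """Reduce the comparison signals with assumed two-input AND gates.

    Returns (hit, stages), where stages is the number of gate levels the
    signals pass through. Expects a non-empty list of 0/1 values.
    """
    level = list(signals)
    stages = 0
    while len(level) > 1:
        # Pair up neighbouring signals; an odd one is passed to the next level.
        next_level = [level[i] & level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            next_level.append(level[-1])
        level = next_level
        stages += 1
    return level[0], stages


stored_tag = [1, 0, 1, 1, 0, 0, 1, 0]
other_tag = [1, 0, 1, 1, 0, 1, 1, 0]  # differs from stored_tag in one bit

match_result = and_tree(xnor_compare(stored_tag, stored_tag))
mismatch_result = and_tree(xnor_compare(stored_tag, other_tag))
```

Under the two-input assumption, an 8-bit tag passes through three gate levels before HIT is valid, and the number of levels grows with the logarithm of the tag width, which is one way to see why the staged output circuit adds delay.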
FIG. 3 illustrates another circuit of a conventional cache hit logic. Referring to FIG. 3, an output circuit 360 of a cache hit logic 300 includes a PMOS pre-charge transistor 370, NMOS transistors 371 to 378, and a latch 361. The gate of the pre-charge transistor 370 and the gates of the transistors 375 to 378 are connected to a clock signal CLK. When the clock signal CLK is at a low level, a node N1 is pre-charged to a source voltage level by the pre-charge transistor 370.
When the clock signal CLK transitions to the high level, the pre-charge transistor 370 is turned off and the NMOS transistors 375 to 378 are turned on. At this time, comparison signals Y<n:0> output from XOR gates 351 to 354 of a comparison logic 350 turn the NMOS transistors 371 to 374 on or off.
When the tag address TAGOUT<n:0> sensed by a tag memory cell array 310 coincides with the input tag address TAGADD<n:0>, the comparison signals Y<n:0> are at the low level. Therefore, the NMOS transistors 371 to 374 are turned off such that the first node N1 is maintained at the pre-charge level. As a result, a low level hit signal nHIT is output through the latch 361.
If even one bit of the tag address TAGOUT<n:0> sensed by the tag memory cell array 310 differs from the corresponding bit of the input tag address TAGADD<n:0>, the comparison signal corresponding to the mismatched bit is at the high level. When even one of the NMOS transistors 371 to 374 is turned on, the first node N1 is discharged. As a result, the high level hit signal nHIT is output through the latch 361.
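The pre-charge and evaluate behavior of the output circuit 360 can be summarized in a small behavioral sketch. It models logic levels only, not timing, and the function name `dynamic_nhit` is hypothetical. The inversion between the node N1 and nHIT is inferred from the description above (N1 held at the pre-charge level gives a low level nHIT) and is an assumption of the model.

```python
def dynamic_nhit(tagout, tagadd):
    """Return the level of nHIT (0 = tags match, 1 = mismatch)."""
    # Pre-charge phase (CLK low): node N1 is charged to the source voltage level.
    n1 = 1

    # Evaluate phase (CLK high): the XOR outputs Y drive the pull-down transistors.
    y = [a ^ b for a, b in zip(tagout, tagadd)]
    if any(y):
        # At least one pull-down path conducts, so N1 is discharged.
        n1 = 0

    # Assumed: the latch stage inverts N1 to produce nHIT.
    return 1 - n1


nhit_match = dynamic_nhit([1, 0, 1, 1], [1, 0, 1, 1])
nhit_mismatch = dynamic_nhit([1, 0, 1, 1], [1, 1, 1, 1])
```

Because every mismatched bit can discharge N1 in parallel, the result does not pass through successive gate stages, which is consistent with the shorter delay attributed to this circuit.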
The delay of the output circuit 360 in the cache hit logic 300 illustrated in FIG. 3 is shorter than the delay of the output circuit 160 in the cache hit logic 100 illustrated in FIG. 1. However, when the period of the clock signal CLK changes, it is not possible to guarantee the reliability of the hit signal nHIT.
FIG. 4A illustrates the setup margin from when the comparison signals Y<n:0> are output until the hit signal nHIT is output, when a frequency F is ½T. If the period of the clock signal CLK is properly set when the cache hit logic 300 is designed, it is possible to secure an optimal setup margin.
However, once the period of the clock signal CLK has been determined, if it takes a long time to sense the tag address TAGOUT<n:0> from the tag memory cell array 310, the hit signal nHIT may be output at an undesired level. Also, when the period of the clock signal CLK is short, a sufficient setup margin cannot be secured, so that the hit signal nHIT may again be output at an undesired level.
On the other hand, as illustrated in FIG. 4B, when the period of the clock signal CLK is set long (F=1/T) in order to synchronize the cache memory with peripheral circuits, the setup margin is long. However, the operation speed of the cache hit logic 300 is reduced, which limits the design of a high-speed cache memory and processor.