1. Field of the Invention
The present invention is generally in the field of processors. More specifically, the invention is in the field of cache memories.
2. Background Art
As is generally known, computer programs continue to increase in size. As computer programs grow in size, the memory requirements of the computer and various memory devices also increase. However, as the size of a program currently residing in the computer's main memory (also referred to as the “external memory” in the present application) gets larger, the speed at which the processor executes tasks begins to decrease. This results from the constant fetching of instructions from the main memory of the computer into the processor (also referred to as a “Central Processing Unit” or “CPU”). The larger the program currently being used, the more often instructions must be fetched. This fetching process requires a certain number of clock phases. Therefore, the more often instructions have to be fetched from the main memory, the less time the processor has available to decode and execute those instructions and the slower the speed at which the processor can finish tasks.
Thus, it is desirable to set aside in a local memory, i.e. a memory requiring less access time than the main memory, a limited number of program instructions that the processor may want to fetch. An instruction cache is such a local memory. An instruction cache is a relatively small memory module where a limited number of program instructions may be stored.
The processor performs constant checks to determine whether instructions stored in the main memory required by the processor are already resident in the instruction cache. If they are already resident in the instruction cache, the instruction fetch step is performed by referring to the instruction cache, since there is no need to go to the main memory to find what is already in the instruction cache.
Thus, the processor must be able to determine if an instruction to be fetched from the main memory is already resident in the instruction cache. The processor's program counter contains the address of an instruction needed by the processor. One way to determine if an instruction is already resident in the instruction cache is to keep track of the addresses of the instructions when they are first brought into the instruction cache from the main memory. To do this, copies of certain upper bits of the instruction addresses (also referred to as the “instruction addresses” in the present application) are stored in a tag memory bank where each entry in the tag memory bank is referred to as a “tag.” As an example, the upper 22 bits of a 34-bit instruction address can comprise the tag. These upper 22 bits of the 34-bit instruction address are referred to as a “tag,” and the individual bits in the tag are referred to as “tag bits” in the present application.
When the processor wishes to determine whether a particular instruction is resident in the instruction cache, the address of the instruction is sent from the program counter across the address bus to the instruction cache and the tag memory bank. In the present example, the 22-bit tags within the tag memory bank and 32-bit wide instructions in the instruction cache are read. The upper 22 bits of the address of the instruction contained in the program counter are then compared with a tag in the tag memory. If there is a match, also referred to as a “hit,” the instruction is already resident in the instruction cache, and it is not necessary to fetch the instruction from the main memory. If there is no match, also referred to as a “miss,” the instruction must be fetched from the main memory at the address contained in the program counter.
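The tag comparison described above can be sketched in Python as a minimal, non-authoritative model. The address and tag widths follow the example in the text; the use of all remaining low-order bits as an entry index, the dictionary-based storage, and the names `split_address` and `SingleBankCache` are assumptions made only for illustration.

```python
ADDRESS_BITS = 34                      # address width from the example in the text
TAG_BITS = 22                          # upper bits stored as the tag
INDEX_BITS = ADDRESS_BITS - TAG_BITS   # assumption: remaining bits select an entry


def split_address(addr):
    """Split an address into (tag, index) under the assumptions above."""
    tag = addr >> INDEX_BITS
    index = addr & ((1 << INDEX_BITS) - 1)
    return tag, index


class SingleBankCache:
    """Hypothetical model of one instruction cache plus one tag memory bank."""

    def __init__(self):
        self.tags = {}          # index -> stored tag (sparse stand-in for the tag bank)
        self.instructions = {}  # index -> stored instruction word

    def fill(self, addr, instruction):
        """Record an instruction brought in from main memory, with its tag."""
        tag, index = split_address(addr)
        self.tags[index] = tag
        self.instructions[index] = instruction

    def lookup(self, addr):
        """Return (hit, instruction); on a miss the instruction is None."""
        tag, index = split_address(addr)
        if self.tags.get(index) == tag:
            return True, self.instructions[index]
        return False, None
```

In this sketch, two addresses that share the same low-order bits but differ in their upper 22 bits map to the same entry, so only the one whose tag is currently stored produces a hit.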
A “set-associative” cache consists of multiple sets, each set consisting of an instruction cache and a tag memory bank. A set-associative cache decreases the number of instances where the program is required to return to the main memory. This is because a number of instruction caches hold instructions corresponding to a number of different segments of a computer program. Thus, the speed at which the processor executes a program increases since there is a greater chance that the processor can find a desired instruction in the set-associative cache.
A set-associative cache also has disadvantages. Because there are multiple tag memory banks, each tag memory bank must be accessed to determine if a tag which is resident in that bank matches the corresponding upper bits contained in the program counter. In the present example, each tag memory bank must be accessed to determine whether it has a tag which matches the upper 22 bits in the program counter. Power is consumed each time a tag and an instruction are read from a tag memory bank and an instruction cache, respectively. For example, if the set-associative cache has four tag memory banks and four instruction caches, each time the processor accesses the set-associative cache, four instructions and four tags are read. Thereafter, at most a single tag is matched and an instruction corresponding to the matched tag is identified as the desired instruction. Thus, the power consumed in a set-associative cache increases as the number of tags read and the number of instructions read increase.
Thus, although a set-associative cache increases the speed with which the processor executes tasks, there is a corresponding increase in power consumption resulting from the reading of the additional tags and instructions from the additional sets of instruction caches and tag memory banks. Using the example above, it can be seen that in addition to the power consumed from reading and comparing the four tags, power is consumed reading four instructions, although at most only one of the instructions will be the desired instruction.
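The read overhead in the example above can be tallied with a short Python sketch. It is only a counting model under stated assumptions: the function name `reads_per_access` and the notion of a "discarded" instruction read (one that does not supply the desired instruction) are introduced here for illustration and are not terms from the present application.

```python
def reads_per_access(num_sets, hit):
    """Count reads for one access to a conventional set-associative cache.

    Every tag memory bank and every instruction cache is read on each access;
    at most one of the instructions read is the desired one.
    """
    tags_read = num_sets
    instructions_read = num_sets
    useful = 1 if hit else 0               # at most one tag can match
    discarded = instructions_read - useful
    return tags_read, instructions_read, discarded


# Four sets, as in the example in the text.
tags, instructions, discarded = reads_per_access(4, hit=True)
```

With four sets and a hit, the sketch reports four tag reads and four instruction reads, three of which are discarded; on a miss all four instruction reads are discarded.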
Thus, it can be seen that there is a need in the art for a method to implement a set-associative cache which maintains the advantages discussed above, such as increased operating speed, while at the same time reducing the additional power consumption inherent in a set-associative cache.
The present invention is directed to apparatus and method for reducing power consumption in a cache. The invention's set-associative cache maintains the advantages of increased operating speed, while at the same time reducing the power consumption inherent in a set-associative cache.
According to the present invention, a power reduction signal (also called a “same block” signal in the present application) is generated. The power reduction signal indicates whether a subsequent instruction to be fetched from an instruction cache belongs in the same block as a previous instruction fetched from the same instruction cache. When the subsequent instruction belongs to the same block as the previous instruction, there is no need to perform a tag read or an instruction read from an instruction cache other than the same instruction cache which contains the block to which the subsequent instruction belongs. Since the number of tag reads and the number of instruction reads are dramatically reduced, the power consumption in the cache is also significantly reduced.
In one embodiment, the power reduction signal is generated by a logical combination of an increment address signal and a signal indicating if a block boundary has been crossed.
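One plausible reading of this logical combination is sketched below in Python. The text does not give the exact logic, so the form "increment address AND NOT block boundary crossed" is an assumption, as are the block size of eight instructions, word-granular addresses, four sets, and the names `same_block` and `count_reads`. The sketch also assumes that no tag is read while the signal is asserted and that a single instruction is read from the cache already known to hold the block.

```python
WORDS_PER_BLOCK = 8  # hypothetical block size, in instructions
NUM_SETS = 4         # hypothetical number of instruction cache / tag bank pairs


def same_block(prev_addr, next_addr):
    """Assumed form of the power reduction ("same block") signal."""
    increment_address = next_addr == prev_addr + 1
    boundary_crossed = (next_addr // WORDS_PER_BLOCK) != (prev_addr // WORDS_PER_BLOCK)
    return increment_address and not boundary_crossed


def count_reads(addresses, power_reduction):
    """Return (tag_reads, instruction_reads) for a sequence of fetch addresses."""
    tag_reads = 0
    instruction_reads = 0
    prev = None
    for addr in addresses:
        if power_reduction and prev is not None and same_block(prev, addr):
            instruction_reads += 1         # read only the cache holding the block
        else:
            tag_reads += NUM_SETS          # full lookup: every tag bank is read
            instruction_reads += NUM_SETS  # and every instruction cache is read
        prev = addr
    return tag_reads, instruction_reads
```

For sixteen sequential fetches starting at address 0, the sketch performs a full lookup only at addresses 0 and 8, where a new block begins; the remaining fourteen fetches each read a single instruction and no tags. A branch to a non-sequential address deasserts the signal in this model even when the target lies in the same block, which is a consequence of the assumed logic rather than something the text states.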