1. Field of the Invention
The present invention relates to computers and, more specifically, to a circuit employed in address prediction in a computational circuit.
2. Description of the Prior Art
One method of improving computer speed is referred to as “pipelining,” in which instructions are fed into a pipeline for an execution unit in a multi-stage processor. For example, to process a typical instruction, a pipeline may include separate stages for fetching the instruction from memory, executing the instruction, and writing the results of the instruction back into memory. Thus, for a sequence of instructions fed into a pipeline, while the third stage writes the results of the first instruction back into memory, the second stage executes the next instruction and the first stage fetches the instruction after that. Although each individual instruction may take several clock cycles to process, other instructions are processed at the same time, so the overall throughput of the processor may be greatly improved.
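The throughput benefit described above can be illustrated with a minimal sketch. It assumes an idealized pipeline in which every stage takes exactly one clock cycle and no stalls occur. The function names are hypothetical and are used only for illustration.

```python
def cycles_unpipelined(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction finishes all stages before the next starts."""
    return num_instructions * num_stages


def cycles_pipelined(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when a new instruction enters the pipeline every cycle.

    Assumption: one cycle per stage and no stalls (an idealized model).
    The first instruction needs num_stages cycles to complete; each
    later instruction completes one cycle after its predecessor.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


if __name__ == "__main__":
    # Three stages, as in the example above: fetch, execute, write back.
    print(cycles_unpipelined(100, 3))
    print(cycles_pipelined(100, 3))
```

Under these assumptions, each instruction still takes three cycles from fetch to write-back, but for a long instruction sequence the pipelined processor completes roughly one instruction per cycle.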
Further improvement can be accomplished through the use of cache memory. Cache memory is a type of memory that is typically faster than main memory in a computer. A cache is typically coupled to one or more processors and to a main memory. A cache speeds access by maintaining a copy of the information stored at selected memory addresses so that access requests to the selected memory addresses by a processor are handled by the cache. Whenever an access request is received for a memory address not stored in the cache, the cache typically retrieves the information from the memory and forwards the information to the processor.
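The cache behavior described above can be sketched as follows. This is a simplified software model and not a description of any particular hardware. The class name, the use of a dictionary as main memory, the unbounded capacity, and the policy of keeping a copy of every missed address are all assumptions made for illustration.

```python
class SimpleCache:
    """Toy model of a cache placed between a processor and main memory."""

    def __init__(self, main_memory: dict):
        self.main_memory = main_memory  # assumed backing store: address -> value
        self.lines = {}                 # copies of selected addresses
        self.hits = 0
        self.misses = 0

    def read(self, address: int):
        # Hit: the request is handled by the cache itself.
        if address in self.lines:
            self.hits += 1
            return self.lines[address]
        # Miss: retrieve the information from memory and forward it.
        self.misses += 1
        value = self.main_memory[address]
        # Assumption: the cache keeps a copy so later requests hit.
        self.lines[address] = value
        return value


if __name__ == "__main__":
    memory = {0x10: "a", 0x20: "b"}  # hypothetical memory contents
    cache = SimpleCache(memory)
    cache.read(0x10)  # miss, fetched from memory
    cache.read(0x10)  # hit, served by the cache
    print(cache.hits, cache.misses)
```

A real cache has a fixed capacity and a replacement policy, both of which this sketch omits.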
The benefits of a cache are maximized whenever the number of access requests to cached memory addresses, known as “cache hits,” is maximized relative to the number of access requests to non-cached memory addresses, known as “cache misses.” One way to increase the hit rate for a cache is to increase the size of the cache. However, adding size to a cache memory may increase costs associated with the computer and may extend the access time associated with the cache.
For integer and commercial code streams, manipulation of addresses dominates the workload. In an instruction cache, the array area is generally proportional to the maximum size of the addresses stored therein. In attempting to implement value prediction for 64-bit architectures, the cache array area used to cache predicted values doubles over 32-bit architectures. This not only costs significant chip area but also slows down access to the value prediction cache array.
In a 64-bit architecture, the higher-order bits of 64-bit addresses tend to be highly correlated in time, especially for segmented address models. This occurs because typically only four to eight segments are hotly active at a given point in the program, even though dozens of addresses may exist. In a typical 64-bit architecture, only about 32 lower-order bits of an address change in a nearly random manner, whereas the higher-order 32 bits tend to fall into several slower-changing patterns. Over the course of several hundred sequential cycles, these slower-changing patterns of higher-order bits are limited in number and appear to be static.
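The observation above can be illustrated by splitting each 64-bit address into its higher-order and lower-order 32 bits and counting the distinct higher-order patterns in a window of addresses. This is only a sketch. The function names and the sample addresses are hypothetical and are not taken from any measured trace.

```python
MASK_32 = 0xFFFFFFFF


def split_address(address: int) -> tuple[int, int]:
    """Return (higher-order 32 bits, lower-order 32 bits) of a 64-bit address."""
    upper = (address >> 32) & MASK_32
    lower = address & MASK_32
    return upper, lower


def distinct_upper_halves(addresses) -> set[int]:
    """Collect the distinct higher-order 32-bit patterns seen in a window."""
    return {split_address(a)[0] for a in addresses}


if __name__ == "__main__":
    # Hypothetical window of addresses: the lower halves vary widely,
    # while the upper halves repeat a small number of patterns.
    window = [
        0x00007FFF00001000,
        0x00007FFFABCD0040,
        0x0000000100002000,
        0x00007FFF12345678,
    ]
    print(len(distinct_upper_halves(window)))
```

When the number of distinct higher-order patterns in a window is small, as assumed in this example, most of the variation between addresses is carried by the lower-order 32 bits.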
Therefore, there is a need for a circuit that predicts addresses while using only a limited amount of cache space.