1. Field of the Invention
The present invention relates to the field of computer systems. More particularly, the present invention relates to cache memory on these computer systems.
2. Background
Typically, a central processing unit (CPU) in a computer system operates at a substantially faster speed than the main memory of the computer system. Most computer systems provide cache memory, which can operate at a higher speed than the main memory, to buffer data and instructions between the main memory and the high-speed CPU. At any particular point in time, the cache memory stores a subset of the data and instructions stored in the main memory.
During read cycles, data and instructions are fetched from the cache memory if they are currently stored in the cache memory (read cache hits). Otherwise (read cache misses), they are retrieved from the main memory and stored in the cache memory as well as provided to the CPU. Similarly, during write cycles, data is written into the cache memory if the data is currently stored in the cache memory (write cache hits). Otherwise (write cache misses), the data is either not written into the cache memory (no write allocate) or written into the cache memory after forcing a cache line update (write allocate). Furthermore, data is written into the main memory either immediately (write through) or when a cache line is reallocated (write back).
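The read and write policies described above can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions: a direct-mapped cache, 32-byte lines, and counters in place of real data transfers. The class name `TinyCache` and all of its parameters are hypothetical and are not taken from the present invention.

```python
BLOCK_SIZE = 32  # bytes per cache line (assumed for illustration)


class TinyCache:
    """Hypothetical direct-mapped cache that only counts main memory traffic."""

    def __init__(self, n_lines=4, write_allocate=True, write_back=True):
        self.tags = [None] * n_lines    # block number held by each line
        self.dirty = [False] * n_lines  # line modified but not yet in memory
        self.write_allocate = write_allocate
        self.write_back = write_back
        self.mem_reads = 0              # line fills from main memory
        self.mem_writes = 0             # writes reaching main memory

    def _locate(self, addr):
        block = addr // BLOCK_SIZE
        return block % len(self.tags), block

    def _fill(self, index, block):
        # Reallocating a line: a dirty line is written back first.
        if self.dirty[index]:
            self.mem_writes += 1
        self.mem_reads += 1
        self.tags[index] = block
        self.dirty[index] = False

    def read(self, addr):
        index, block = self._locate(addr)
        hit = self.tags[index] == block
        if not hit:
            self._fill(index, block)    # read miss: fetch from main memory
        return hit

    def write(self, addr):
        index, block = self._locate(addr)
        hit = self.tags[index] == block
        if not hit and self.write_allocate:
            self._fill(index, block)    # write allocate: force a line update
        if self.tags[index] == block:
            if self.write_back:
                self.dirty[index] = True  # memory updated on reallocation
            else:
                self.mem_writes += 1      # write through: update memory now
        else:
            self.mem_writes += 1          # no write allocate: bypass the cache
        return hit


if __name__ == "__main__":
    cache = TinyCache(write_allocate=True, write_back=True)
    for op, addr in [("read", 0), ("write", 0), ("read", 128)]:
        hit = getattr(cache, op)(addr)
        print(op, hex(addr), "hit" if hit else "miss")
    print("memory reads:", cache.mem_reads, "memory writes:", cache.mem_writes)
```

In this sketch, a write back cache defers the memory write until the dirty line is reallocated, while a write through cache counts a memory write on every store.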
Since the CPU goes idle in the event of a cache miss, the size and operating characteristics of the cache memory are typically optimized to provide a high cache hit rate, thereby reducing CPU idle time and improving system performance. As CPU speeds continue to increase, various performance-motivated approaches have also been developed to make cache hits faster or to reduce the cache miss penalty, thereby further reducing CPU idle time and improving system performance. Well-known examples are virtual addressing to make cache hits faster, early restart and out-of-order fetching to reduce the read miss penalty, use of a write buffer to reduce the write miss penalty, and use of two-level caches to reduce the read/write miss penalty. In the case of the two-level cache approach, the first level cache is typically made small enough to match the clock cycle time of the CPU, while the second level cache is made large enough to capture many fetches that would otherwise have to go to main memory.
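The trade-off among hit time, miss rate and miss penalty is commonly summarized by the textbook relation: average memory access time = hit time + miss rate x miss penalty. The following minimal sketch applies that relation to a single-level and a two-level cache. The function names and all numeric values are hypothetical, chosen only for illustration, and are not measurements of any particular system.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in CPU clock cycles."""
    return hit_time + miss_rate * miss_penalty


def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, mem_penalty):
    """Two-level case: the first level miss penalty is the average
    access time of the second level cache."""
    return amat(l1_hit, l1_miss_rate, amat(l2_hit, l2_miss_rate, mem_penalty))


if __name__ == "__main__":
    # Hypothetical values: 1-cycle first level hit, 5% first level miss rate,
    # 10-cycle second level hit, 20% second level local miss rate,
    # 50-cycle main memory access.
    print("single level:", amat(1, 0.05, 50))
    print("two level:   ", amat_two_level(1, 0.05, 10, 0.20, 50))
```

Under these assumed values, adding the second level cache lowers the average access time because most first level misses are served without going to main memory.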
However, traditional approaches to reducing CPU idle time and improving system performance seldom take the intrinsic characteristics of the applications that run on the computer systems into consideration, even though it is well known that many applications, due to their inherent nature, cause the CPU to go idle frequently and degrade system performance. For example, in many vector applications, it is quite common for a program to execute a statement like C[i]=A[i]+B[i] for i=1 to N, where N is a large number and A, B and C are arrays. Assuming the starting addresses for arrays A and B are addr1 and addr2 respectively, the size of the array elements is 32 bytes, and data is fetched in 32 byte blocks, the CPU will have to access addr1, addr2, addr1+32, addr2+32, addr1+64, addr2+64, and so forth. After at most n accesses, where n is the number of cache lines, which is typically smaller than N, each subsequent access will result in a cache miss requiring access to the main memory. The data last fetched and stored in the cache lines are never used; they just keep getting overlaid.
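The access pattern of the vector example above can be reproduced with a short simulation. The following is a minimal sketch, assuming a fully associative cache with least-recently-used replacement and modeling only the reads of arrays A and B. The function name, the starting addresses and the cache size are hypothetical values chosen for illustration.

```python
from collections import OrderedDict


def simulate_vector_sum(n_elements, n_lines, addr1, addr2,
                        elem_size=32, block_size=32):
    """Count hits, misses and overlaid lines for the interleaved
    accesses addr1, addr2, addr1+32, addr2+32, and so forth."""
    cache = OrderedDict()  # block number -> None, oldest entry first
    hits = misses = overlaid = 0
    for i in range(n_elements):
        for base in (addr1, addr2):
            block = (base + i * elem_size) // block_size
            if block in cache:
                hits += 1
                cache.move_to_end(block)       # mark as most recently used
            else:
                misses += 1
                if len(cache) >= n_lines:
                    cache.popitem(last=False)  # overlay the oldest line
                    overlaid += 1
                cache[block] = None
    return hits, misses, overlaid


if __name__ == "__main__":
    # Hypothetical parameters: N = 1000 elements, n = 64 cache lines.
    hits, misses, overlaid = simulate_vector_sum(
        n_elements=1000, n_lines=64, addr1=0x10000, addr2=0x80000)
    print("hits:", hits, "misses:", misses, "lines overlaid:", overlaid)
```

Because each 32 byte element occupies its own 32 byte block and no element is accessed twice, the sketch reports no hits at all: every access goes to main memory, and once the n lines are full each new fetch overlays a line whose data was never reused.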
Thus, it is desirable to have a cache memory whose design takes into consideration the inherent nature of some of the more popular applications that affect CPU idle time and system performance. In particular, it is desirable to have the cache memory's design take into consideration the inherent sequential access nature of vector applications. As will be disclosed, these objects and desired results are among the objects and desired results of the present invention.
For further description of cache memory, cache performance problems, and improvement techniques, see J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, pp. 402-461 (Morgan Kaufmann, 1990).