The present application relates generally to an improved data processing apparatus and method and more specifically to an apparatus and method for gang fetching, gang replacement, and adaptive linesize in a cache.
A cache is used to speed up data transfer and may be either temporary or permanent. Memory caches are present in virtually every computer to speed up instruction execution and data retrieval and updating. These temporary caches serve as staging areas, and their contents are constantly changing. A memory cache, or “CPU cache,” is a memory bank that bridges main memory and the central processing unit (CPU). A memory cache is faster than main memory and allows instructions to be executed and data to be read and written at higher speed. Instructions and data are transferred from main memory to the cache in fixed-size blocks, known as cache “lines.”
Caches take advantage of “temporal locality,” which means the same data item is often reused many times. Caches also benefit from “spatial locality,” wherein the next instruction to be executed or the next data item to be processed is likely to be located adjacent in memory to the current one. The more often the same data item is processed or the more sequential the instructions or data, the greater the chance for a “cache hit.” If the next item is not in the cache, a “cache miss” occurs, and the CPU may go to main memory to retrieve it.
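As a minimal illustrative sketch of the hit and miss behavior described above, the following Python fragment models a very small cache that holds whole lines and evicts the least recently used line. The line size, the number of lines, the replacement policy, and the name run_trace are assumptions made for illustration only and are not taken from any particular system.

```python
from collections import OrderedDict

LINE_SIZE = 64   # bytes per cache line (assumed for illustration)
NUM_LINES = 4    # number of lines the toy cache holds (assumed)

def run_trace(addresses, line_size=LINE_SIZE, num_lines=NUM_LINES):
    """Return (hits, misses) for a trace of byte addresses."""
    cache = OrderedDict()  # keys are line numbers, oldest first
    hits = misses = 0
    for addr in addresses:
        line = addr // line_size          # line containing this byte
        if line in cache:
            hits += 1
            cache.move_to_end(line)       # mark as most recently used
        else:
            misses += 1
            cache[line] = True            # fetch the whole line
            if len(cache) > num_lines:
                cache.popitem(last=False) # evict least recently used line
    return hits, misses

temporal = [0] * 8                         # the same item reused
spatial = list(range(0, 64, 8))            # neighboring items in one line
scattered = [i * 4096 for i in range(8)]   # every access in a new line

for name, trace in [("temporal", temporal),
                    ("spatial", spatial),
                    ("scattered", scattered)]:
    print(name, run_trace(trace))
```

In this sketch, the traces with temporal or spatial locality miss only on the first access, because every later access falls in a line that is already cached, whereas the scattered trace misses on every access.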
Caches are organized at a linesize granularity to exploit spatial locality. Using a large linesize provides a performance improvement proportional to the amount of spatial locality in the memory reference stream. However, when spatial locality is low, using a large linesize may hurt cache performance, because each miss fetches data that is unlikely to be used. A smaller linesize provides a higher number of lines for a given cache capacity and places less pressure on memory bandwidth.
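The linesize tradeoff can be illustrated with a small sketch that holds the cache capacity fixed and varies only the linesize. The direct-mapped organization, the 1024-byte capacity, the two address traces, and the name simulate are hypothetical choices made for this example.

```python
def simulate(addresses, capacity, line_size):
    """Direct-mapped cache model: return (misses, bytes_fetched)."""
    num_lines = capacity // line_size   # smaller lines give more lines
    tags = [None] * num_lines           # line number stored in each slot
    misses = 0
    for addr in addresses:
        line = addr // line_size
        index = line % num_lines        # slot this line maps to
        if tags[index] != line:
            tags[index] = line          # replace whatever was there
            misses += 1
    return misses, misses * line_size   # each miss transfers one full line

CAPACITY = 1024  # bytes (assumed)

# High spatial locality: consecutive 8-byte items, each touched once.
sequential = list(range(0, 2048, 8))

# Low spatial locality: 16 widely spaced items, reused over four passes.
sparse = [i * 160 for i in range(16)] * 4

for name, trace in [("sequential", sequential), ("sparse", sparse)]:
    for line_size in (32, 128):
        misses, fetched = simulate(trace, CAPACITY, line_size)
        print(name, line_size, misses, fetched)
```

Under these assumptions, the sequential trace incurs fewer misses with the 128-byte line than with the 32-byte line, while the sparse trace incurs more misses and transfers more bytes with the larger line, because fewer lines are available and most of each fetched line goes unused.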
The most suitable cache linesize depends on the spatial locality of the application and may therefore be determined at runtime. Unfortunately, current systems use the same linesize for all applications.