Digital computers often have a “cache memory” placed logically between the processor and main memory. A cache memory is useful because a large main memory is inevitably slower than a much smaller cache memory. The cache can be placed physically closer to the processor and designed to hold, in a statistical sense, most of the data that the processor references over brief periods of time.
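As a rough illustration of why holding most of the referenced data in a fast cache helps, the following sketch computes an average access time from a hit fraction and two latencies. The function name and the numeric values (a 1 ns cache, a 100 ns main memory, a 95% hit fraction) are assumptions chosen only for illustration, not figures taken from any particular system.

```python
def average_access_time(hit_fraction, cache_latency, memory_latency):
    """Weighted average latency for a single-level cache (simplified model).

    Assumes a hit costs `cache_latency` and a miss costs `memory_latency`;
    real systems add lookup and transfer overheads not modeled here.
    """
    if not 0.0 <= hit_fraction <= 1.0:
        raise ValueError("hit_fraction must be between 0 and 1")
    return hit_fraction * cache_latency + (1.0 - hit_fraction) * memory_latency


# Hypothetical latencies in nanoseconds.
with_cache = average_access_time(0.95, 1.0, 100.0)     # about 5.95 ns
without_cache = average_access_time(0.0, 1.0, 100.0)   # every access goes to memory
```

Under these assumed numbers, the average access time falls from 100 ns to roughly 6 ns, even though one access in twenty still goes to main memory.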
In a Symmetric MultiProcessor (SMP) system, in which a number of processors share the main memory, each processor usually has its own cache, built very close to the processor with which it is associated. This makes for fast operation as long as the processors are not sharing data, but it becomes very awkward when they do share data. When processors share data, a “store” from one processor must somehow be communicated to the other processors, which might have copies of the data in their caches. For instance, a processor could read a variable into its cache, then perform an operation on the variable and write the new value of the variable into its cache. Other processors, holding “old” copies of the variable, will not know that its value has changed unless the processor changing the data somehow communicates the change. This is called the “cache coherence” problem. Methods for solving this problem are known but will not be discussed here.
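The stale-copy scenario described above can be sketched with a small model. The `PrivateCache` class, the address `0x10`, and the initial value are hypothetical and exist only for this illustration; the model deliberately omits any coherence mechanism so that the problem is visible.

```python
class PrivateCache:
    """A per-processor cache with no coherence protocol (illustrative model)."""

    def __init__(self, memory):
        self.memory = memory  # shared main memory, modeled as a dict
        self.lines = {}       # address -> locally cached value

    def load(self, addr):
        # On a miss, fetch from main memory; on a hit, use the local copy.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def store(self, addr, value):
        # The new value is written only into this cache. Nothing is
        # communicated to main memory or to the other caches.
        self.lines[addr] = value


main_memory = {0x10: 5}                # hypothetical shared variable
cache_a = PrivateCache(main_memory)    # cache of processor A
cache_b = PrivateCache(main_memory)    # cache of processor B

cache_a.load(0x10)                     # both processors read the variable
cache_b.load(0x10)
cache_a.store(0x10, cache_a.load(0x10) + 1)  # A updates its cached copy

value_seen_by_a = cache_a.load(0x10)   # 6, the new value
value_seen_by_b = cache_b.load(0x10)   # 5, the "old" copy
```

In this sketch processor B keeps reading the old value because nothing tells its cache that the variable has changed.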
An alternative that avoids the cache coherence problem is to put the cache in the main memory, or to put a separate cache in each memory bank if the main memory is divided into a number of banks. This solves the coherence problem but, in many designs, it places the cache too far from the processor, increasing access times. For instance, the bus between a processor and a cache memory placed close to it may be only a small number of millimeters or even micrometers long, while the bus between a processor and a main memory may be a relatively large number of millimeters or even centimeters long. As is known in the art, in general, the longer the metal runs making up a bus, the longer the access time to the memory. Additionally, long runs might require stronger drivers or perhaps repeaters.
Another problem with an SMP system in which each processor has a local cache is that the caches may not all be used to the same extent. For instance, at certain times one processor may use only half of its associated cache, while another processor may need more space than its associated cache provides. With a conventional SMP system, there is usually no way for one processor to access a cache associated with a different processor in order to equalize cache usage. There are techniques that attempt to divide a problem among processors in order to share the processor and cache loads more evenly. However, these techniques are complex, are often inexact, and might not use the caches as efficiently as they could be used.
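The imbalance can be illustrated with a minimal simulation of two equally sized private caches. The least-recently-used replacement policy, the cache capacity of 8 lines, and the two access traces are all assumptions made for this sketch; they are not drawn from any specific SMP design.

```python
from collections import OrderedDict


def run_lru(capacity, accesses):
    """Return (misses, peak_lines_used) for an LRU cache of `capacity` lines."""
    cache = OrderedDict()
    misses = 0
    peak = 0
    for addr in accesses:
        if addr in cache:
            cache.move_to_end(addr)        # hit: mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used line
            cache[addr] = True
        peak = max(peak, len(cache))
    return misses, peak


CAPACITY = 8   # lines per private cache (hypothetical)
ROUNDS = 10    # how many times each processor sweeps its data

small_trace = list(range(4)) * ROUNDS    # working set of 4 lines
large_trace = list(range(12)) * ROUNDS   # working set of 12 lines

small_misses, small_peak = run_lru(CAPACITY, small_trace)
large_misses, large_peak = run_lru(CAPACITY, large_trace)
```

With these assumed traces, the first processor misses only 4 times and never occupies more than half of its cache, while the second misses on all 120 accesses because its working set exceeds its cache and the idle lines next door are out of reach.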
Therefore, a need still exists for associating caches with processors while avoiding, if desired, the cache coherence problem and avoiding the problems of having a cache dedicated to a single processor.