A computer system would ideally use very fast memory for all of its temporary storage needs. This would allow the Central Processing Unit (CPU) to operate at its designed speed, without the need to wait for slower memory devices. However, slower memory is often used because it is less expensive, consumes less power, and provides more storage in a given space than does very fast memory.
A characteristic of most computer application programs is that they tend to perform repetitive operations on the same or neighboring pieces of data. Cache memory systems take advantage of this characteristic by storing recently accessed data in a small amount of very fast memory, called cache memory. Data that is read from slower main memory is stored in the faster cache memory, so that if a program must subsequently use the same data, this data may be read from the cache memory. Thus, cache memory systems increase the apparent speed of memory accesses in computer systems.
A cache memory system must keep track of main memory addresses for which the data is available in the cache. When data is available in the cache, the main memory access is aborted in favor of cache access. This is called a cache "hit." The frequency of cache hits may be increased in many ways. One method is to use an algorithm for deciding which data to place in cache that is tailored to the particular computer application. Another method for increasing the frequency of hits is to use a larger cache memory.
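The address tracking and the effect of cache size described above can be illustrated with a minimal sketch. It assumes a direct-mapped organization, which the text does not specify; the names `DirectMappedCache`, `access`, and `hit_rate` are hypothetical and chosen only for illustration. Only tags are stored, since the point is to show how hits are detected rather than how data is moved.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that records tags only (no data payload)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.tags = [None] * num_lines  # one stored tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit; on a miss, record the address and return False."""
        index = address % self.num_lines   # line that this address maps to
        tag = address // self.num_lines    # distinguishes addresses sharing a line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag             # miss: the line is refilled from main memory
        self.misses += 1
        return False


def hit_rate(num_lines, trace):
    """Fraction of accesses in `trace` that hit in a cache of `num_lines` lines."""
    cache = DirectMappedCache(num_lines)
    for address in trace:
        cache.access(address)
    return cache.hits / len(trace)


# A program loop that sweeps the same 8 addresses four times.
trace = list(range(8)) * 4
small = hit_rate(4, trace)   # 4 lines: each address evicts the one it needs next
large = hit_rate(8, trace)   # 8 lines: the whole working set fits after the first pass
print(small, large)
```

In this contrived trace the 4-line cache never hits because the working set is twice its size, while the 8-line cache hits on every access after the first pass. Real workloads fall between these extremes.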
The optimum amount of cache memory in a computer system depends on many factors, including the particular application, the target cost of the system, the hardware used in the system, and the relative costs of main and cache memory. Some systems may be optimized by using multiple smaller caches, rather than one large cache. For instance, multiple smaller caches may be mapped into non-contiguous blocks of memory.
A cache system that utilizes multiple cache memories must avoid contention between the caches. Contention occurs when more than one cache responds to the same memory address with a cache hit.
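As a rough illustration of contention, the sketch below models each cache only by the address ranges it is mapped to and reports any address that more than one cache would claim. It is an assumption-laden simplification: a real hit also depends on whether the data is actually present, and the names `CacheWindow`, `responders`, and `find_contention`, as well as the address ranges, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheWindow:
    """A cache described only by the (start, end) address ranges mapped to it."""
    name: str
    ranges: tuple  # tuple of (start, end) pairs; end is exclusive

    def claims(self, address):
        return any(start <= address < end for start, end in self.ranges)


def responders(caches, address):
    """Names of all caches whose mapping covers `address`."""
    return [c.name for c in caches if c.claims(address)]


def find_contention(caches, addresses):
    """Addresses claimed by more than one cache."""
    return [a for a in addresses if len(responders(caches, a)) > 1]


# Two caches mapped into interleaved, non-contiguous blocks (assumed layout).
cache_a = CacheWindow("A", ((0x0000, 0x1000), (0x2000, 0x3000)))
cache_b = CacheWindow("B", ((0x1000, 0x2000), (0x3000, 0x4000)))
# A third cache whose range was configured to overlap both of the others.
cache_c = CacheWindow("C", ((0x0800, 0x1800),))

probe = [0x0000, 0x0800, 0x1000, 0x1800, 0x2000]
clean = find_contention([cache_a, cache_b], probe)
clash = find_contention([cache_a, cache_b, cache_c], probe)
print(clean, [hex(a) for a in clash])
```

With disjoint mappings no probe address has more than one responder; adding the overlapping cache produces contention at the two probe addresses that fall inside the overlap.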
Another problem may occur when a CPU utilizes a "burst-mode" operation. Burst-mode operations are performed on data in a sequential series of memory locations. Rather than have the CPU execute a new instruction to address each individual memory location, burst-mode allows the CPU to execute a single instruction specifying a starting memory address, an operation to be performed, and the length of the memory block on which to perform the operation. In such cases, the memory access is preferably done in burst-mode. This may cause special problems in a snooping multiple cache system when a burst-mode operation starts at an address in one cache and is completed in a different cache.
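The boundary-crossing case can be made concrete with a short sketch that splits a burst into the portions handled by each cache. It assumes the caches are interleaved in fixed-size blocks, which the text does not state, and it only detects the crossing; it does not represent any particular mechanism for resolving it. The function names and the block size are hypothetical.

```python
def owning_cache(address, block_size, num_caches):
    """Cache responsible for `address` under an assumed block-interleaved mapping."""
    return (address // block_size) % num_caches


def split_burst(start, length, block_size, num_caches):
    """Split a burst into (cache_id, start_address, length) segments.

    A new segment begins wherever the burst crosses a block boundary,
    because the next block may belong to a different cache.
    """
    segments = []
    address = start
    remaining = length
    while remaining > 0:
        block_end = (address // block_size + 1) * block_size
        chunk = min(remaining, block_end - address)
        segments.append((owning_cache(address, block_size, num_caches), address, chunk))
        address += chunk
        remaining -= chunk
    return segments


# A 16-location burst that stays inside one block is served by a single cache.
inside = split_burst(0x0100, 16, 0x1000, 2)
# The same burst started 8 locations before a boundary ends in the other cache.
crossing = split_burst(0x0FF8, 16, 0x1000, 2)
print(inside, crossing)
```

A burst that yields more than one segment with different cache identifiers is exactly the situation described above: it starts in one cache and is completed in another.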
In a multiprocessing environment, the caches not only service their CPUs, but also monitor memory bus accesses that are initiated by other memory bus masters. In a "copy back" implementation, this activity is called bus snooping. Since data coherency is of paramount importance, the caches may have to stall their CPUs' accesses in favor of the memory monitoring.
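A minimal sketch of bus snooping in a copy-back cache follows. It is a simplified model under stated assumptions, not a description of any specific protocol: the cache stalls its CPU only when a snooped address is actually held, dirty data is flushed to memory before another bus master proceeds, and a snooped write invalidates the local copy. The class name, method names, and return strings are all hypothetical.

```python
class SnoopingCache:
    """Toy copy-back cache that watches bus traffic from other bus masters."""

    def __init__(self):
        self.lines = {}      # address -> (value, dirty flag)
        self.cpu_stalls = 0  # times the local CPU had to wait for a snoop

    def cpu_write(self, address, value):
        # Copy-back: the write stays in the cache; main memory is updated later.
        self.lines[address] = (value, True)

    def snoop(self, address, is_write, memory):
        """React to an access made by another bus master."""
        if address not in self.lines:
            return "ignore"
        self.cpu_stalls += 1             # assumed: CPU waits while the snoop is served
        value, dirty = self.lines[address]
        if dirty:
            memory[address] = value      # flush so the other master sees current data
        if is_write:
            del self.lines[address]      # the local copy is about to become stale
            return "invalidate"
        self.lines[address] = (value, False)
        return "write-back" if dirty else "share"


memory = {0x10: 1}
cache = SnoopingCache()
cache.cpu_write(0x10, 99)                      # memory still holds the old value 1
first = cache.snoop(0x10, False, memory)       # another master reads 0x10
second = cache.snoop(0x20, False, memory)      # address not cached here
third = cache.snoop(0x10, True, memory)        # another master writes 0x10
print(first, second, third, memory[0x10], cache.cpu_stalls)
```

The first snoop forces the modified value out to memory, which is the coherency requirement the paragraph describes; each snoop that matches a cached address costs the local CPU a stall in this model.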