In a data processing system, instructions and associated data are transferred from memory to one or more processors for processing, and the resulting data generated by the processor is then returned to memory for storage. Thus, typical processing operations involve frequent and repetitive reading from and writing to memory. As a result, memory access delays are often a primary limitation on the performance of a data processing system. Preferably, therefore, memory access time should be minimized to maximize performance. However, cost and other constraints often require that the main memory be built from circuitry with relatively long access times. To overcome the resulting performance drawbacks, memory caches are typically used.
A memory cache typically includes a relatively small, but high speed, bank of memory, which can be more rapidly accessed by the processor(s) than the main memory. Memory locations in the main memory are duplicated in the cache. When a particular memory location being accessed by the processor is duplicated in the cache--an event which is known as a cache "hit"--the processor may rapidly access the cache instead of waiting for access to main memory. The cache is managed with the goal of maximizing the fraction of accesses which are hits in the cache.
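The benefit of a high hit fraction can be illustrated with a simplified model in which every hit costs one fixed cache latency and every miss costs one fixed main memory latency. The function name and the latency figures below are hypothetical and chosen only for illustration; they do not describe any particular system.

```python
def average_access_time(hit_rate, cache_time, memory_time):
    """Expected time per access under a simplified two-latency model.

    Assumption: a hit always costs cache_time and a miss always costs
    memory_time; real systems have more varied latencies.
    """
    return hit_rate * cache_time + (1.0 - hit_rate) * memory_time


# Hypothetical latencies, in arbitrary cycles.
for rate in (0.50, 0.90, 0.99):
    print(rate, average_access_time(rate, cache_time=1, memory_time=100))
```

Under these assumed latencies, raising the hit fraction from 0.90 to 0.99 reduces the average access time by roughly a factor of five.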
Caches are typically organized into "lines", which are relatively long sequences of memory locations found in main memory. Typically, when a memory location accessed by a processor is not duplicated in the cache--an event which is known as a cache "miss"--an entire line containing the missed memory location, and neighboring memory locations, is brought into the cache as part of retrieving the missed location from other caches or main memory--an event which is known as a "linefill" into the cache.
Typically, each cache line is associated with multiple groups of locations in the main memory. Each cache line stores duplicates of associated groups of memory locations, as well as an indication of which groups of memory locations are currently stored in that line. Thus, when a processor requests access to a particular memory location, the cache line corresponding to that memory location is accessed to determine whether that cache line is storing the group of memory locations which includes the requested location. If so, the requested memory location is accessed in the cache. If not, a group of memory locations including the requested location is linefilled into the cache.
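The lookup just described can be sketched for the simplest case, in which each cache line holds one group at a time. The line size, the number of lines, and the names split_address and DirectMappedCache are assumptions made for this sketch; the stored "tag" plays the role of the indication of which group currently occupies a line.

```python
LINE_SIZE = 64   # bytes per cache line (assumed for illustration)
NUM_LINES = 8    # number of lines in the cache (assumed for illustration)


def split_address(addr):
    """Return (tag, index, offset) for a byte address."""
    offset = addr % LINE_SIZE
    group = addr // LINE_SIZE    # which line-sized group of main memory
    index = group % NUM_LINES    # the cache line this group is associated with
    tag = group // NUM_LINES     # distinguishes groups sharing that cache line
    return tag, index, offset


class DirectMappedCache:
    """Hypothetical cache holding one group per line; tracks tags only."""

    def __init__(self):
        self.tags = [None] * NUM_LINES   # None marks an empty line

    def access(self, addr):
        """Return 'hit' or 'miss'; a miss linefills the requested group."""
        tag, index, _ = split_address(addr)
        if self.tags[index] == tag:
            return "hit"
        self.tags[index] = tag   # linefill replaces the previous group
        return "miss"


cache = DirectMappedCache()
results = [cache.access(a) for a in (0, 8, 64, 512, 0)]
print(results)   # ['miss', 'hit', 'miss', 'miss', 'miss']
```

Addresses 0 and 512 fall in different groups associated with the same cache line, so the final access to address 0 misses because the group containing address 512 has replaced it.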
Typically, an n-way associative cache stores n of the several groups of locations corresponding to a cache line in the cache at one time. When a group of memory locations is linefilled into the cache, memory contents in the same cache location may need to be replaced. If the contents of the replaced cache line have been modified, then the line has to be stored back into the corresponding group of locations in the main memory--an event which is known as a "castback" or "writeback" from the cache.
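A minimal sketch of an n-way associative cache with castback follows. The text above does not specify how the line to be replaced is chosen; least-recently-used replacement is assumed here purely for illustration, and the class name, parameters, and sizes are hypothetical.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Hypothetical n-way cache that tracks tags and modified flags only."""

    def __init__(self, num_sets=4, ways=2, line_size=64):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # One OrderedDict per set: tag -> modified flag, least recent first.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.castbacks = 0

    def access(self, addr, write=False):
        """Return 'hit' or 'miss'; count castbacks of modified lines."""
        group = addr // self.line_size
        index = group % self.num_sets
        tag = group // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)               # mark as most recently used
            lines[tag] = lines[tag] or write     # a write marks the line modified
            return "hit"
        if len(lines) == self.ways:
            # Assumed policy: replace the least recently used line in the set.
            _, modified = lines.popitem(last=False)
            if modified:
                self.castbacks += 1              # modified line is cast back
        lines[tag] = write                       # linefill the requested group
        return "miss"


cache = SetAssociativeCache(num_sets=4, ways=2, line_size=64)
# Addresses 0, 256 and 512 all correspond to the same set.
trace = [(0, True), (256, False), (0, False), (512, False), (256, False)]
results = [cache.access(addr, write) for addr, write in trace]
print(results)           # ['miss', 'miss', 'hit', 'miss', 'miss']
print(cache.castbacks)   # 1
```

In this trace the first replacement discards an unmodified line and requires no castback, while the second replaces the line that was written earlier and therefore causes one castback.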
In high performance data processing systems, often there are two or more caches, organized so that a processor attempts to access a memory location by first attempting to locate a duplicate of that location in a "level 1" or L1 cache. If there is a miss in the L1 cache, then an attempt is made to locate a duplicate of the desired memory location in a "level 2" or L2 cache. If there is a miss in the L2 cache, each lower level cache is sequentially checked in the same manner. If there is a hit in one of the caches, then the desired memory locations are obtained from that cache, and typically, the accessed memory locations are duplicated, along with neighboring locations completing a cache line, into the appropriate location of at least the L1 cache--although in some cases an access may be "cache-inhibited", in which case the data is not stored in the L1 cache after retrieval. If there are misses in all of the caches, the missed location, along with neighboring locations completing a cache line, is retrieved from main memory, and filled into one or more of the caches if the access is not cache-inhibited. Similarly, if a line is cast back from a cache, the line may be written to a lower level cache, main memory, or both.
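The sequential search through the cache levels can be sketched as follows. Each cache is modeled as an unbounded set of line numbers, and a retrieved line is filled into every level that missed; both are simplifying assumptions, since the text above requires only that the line be placed in at least the L1 cache. The function name load and the line size are hypothetical.

```python
LINE_SIZE = 64   # bytes per cache line (assumed for illustration)


def load(addr, levels, cache_inhibited=False):
    """Return the name of the level that supplied the data.

    levels is an ordered list of (name, set_of_line_numbers) pairs,
    checked from the L1 cache downward. Assumption: on retrieval, the
    line is filled into every level that missed, unless the access is
    cache-inhibited.
    """
    line = addr // LINE_SIZE
    missed = []
    source = "memory"
    for name, contents in levels:
        if line in contents:
            source = name
            break
        missed.append(contents)
    if not cache_inhibited:
        for contents in missed:
            contents.add(line)
    return source


l1, l2 = set(), set()
levels = [("L1", l1), ("L2", l2)]
print(load(0, levels))                           # memory
print(load(32, levels))                          # L1 (same line as address 0)
l1.clear()                                       # pretend L1 lost the line
print(load(0, levels))                           # L2
print(load(4096, levels, cache_inhibited=True))  # memory
print(load(4096, levels))                        # memory (earlier access left no copy)
```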
Typically, lines of instructions and data are transferred from caches and processors to other caches and processors using buffers. For instance, in one architecture two buffers are respectively connected to a level 1 cache and a level 2 cache. These buffers are also connected to main memory, a host processor, and possibly other processors via a system bus. The buffers allow for a smooth transition of data or instructions between components having different transfer rates.
In multiprocessor systems, often one or more lower level caches or the main memory is shared by multiple processors. In such an environment, care must be taken that when data is modified by a processor, the modifications are returned to the shared cache or memory before another processor accesses the data, so that processors do not perform operations on data which has not been updated. Typically, in such an environment, before a processor can modify data, it must request ownership of that data. Once ownership of the data is granted to a processor, that processor has exclusive access to the data, and other processors are prevented from accessing or modifying the data until it is written back to the shared cache or memory. If a first processor seeks to access data that is held exclusively by a second processor, the first processor requests ownership of the data; as a consequence, the second processor is forced to write the data back to the shared cache or memory, and the data is then delivered to the first processor.
This typical structure can lead to inefficiencies in particular situations, for example, where two processors are repeatedly writing to the same data. In such a situation, the first processor will obtain ownership of the data in order to write to it. Then, the second processor will request ownership in order to write to the data, forcing the first processor to write the data back to the shared cache or memory so that the data can be delivered to the second processor in an exclusive state. Then the first processor will request ownership in order to write to the data, forcing the second processor to write the data back to the shared cache or memory so that the data can be delivered to the first processor in an exclusive state. This exchange will repeat as long as both processors are attempting to write to the data, leading to an excessive number of writebacks to the shared cache or memory and a reduction in performance.
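The repeated exchange of ownership can be illustrated by counting writebacks for a single shared line under the ownership rule described above. The class SharedLine, the function count_writebacks, and the processor labels "A" and "B" are hypothetical, and the model ignores reads and timing entirely.

```python
class SharedLine:
    """Hypothetical model of one line that at most one processor may own."""

    def __init__(self):
        self.owner = None
        self.writebacks = 0

    def write(self, cpu):
        """Grant ownership to cpu, forcing a writeback by any other owner."""
        if self.owner is not None and self.owner != cpu:
            # The current owner must write the line back to the shared
            # cache or memory before ownership can move.
            self.writebacks += 1
        self.owner = cpu


def count_writebacks(write_order):
    """Return the number of forced writebacks for a sequence of writers."""
    line = SharedLine()
    for cpu in write_order:
        line.write(cpu)
    return line.writebacks


alternating = ["A", "B"] * 4           # the two processors take turns writing
batched = ["A"] * 4 + ["B"] * 4        # same eight writes, grouped by processor
print(count_writebacks(alternating))   # 7
print(count_writebacks(batched))       # 1
```

The same eight writes force seven writebacks when the processors alternate, but only one when each processor's writes are grouped together, which is the inefficiency described above.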
Accordingly, there is a need for a cache which is managed in a manner that improves its performance, particularly in a multiprocessor environment.