1. Field of the Invention
The present invention relates generally to computer cache schemes, and more particularly to a method and apparatus for maintaining coherence between memories within a system having distributed shared memory.
2. Description of the Related Art
Due to the demand for greater processing power, and due to the desire to have relatively small processors work cooperatively as a multi-processing system, there have been many attempts over the last several years to solve the problems inherent in maintaining coherence between memory devices that are accessible to more than one processing device. The coherence problem is exemplified by a system in which a single logical interconnect channel is shared by a plurality of memory devices, each paired with an associated processor to form a "tightly-coupled" multi-processor system. A reference from a processor may be to an address that is within the memory associated with the requesting processor or within a memory associated with another of the processors within the system.
Each processor is also associated with a local cache. The cache is a relatively small, high-speed memory device which allows data to be accessed rapidly. The cache is typically small because the cost of such high-speed memory is relatively high. It should be understood that there is a tradeoff between the size and the speed of the cache. However, in systems in which the processors are very long instruction word (VLIW) processors, the caches are unusually large, increasing the likelihood that information stored in the cache associated with a first processor is also stored in the cache associated with at least a second processor.
Consider a cache that is a two level cache in which the first level of the cache is smaller and faster than the second level of the cache. Therefore, the data which is most likely to be required is maintained in the first level of the cache, and data which is less likely to be required is maintained in the second level of the cache. If data is required which is not present in either the first or the second level of the cache, then the data must be retrieved from the slower memory device. A number of different algorithms have been devised for determining what data to maintain in each level of the cache (i.e., anticipating what data is most likely to be required soon). However, a detailed understanding of such algorithms is not necessary for the purposes of the present discussion.
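By way of illustration only, the lookup order described above can be sketched as follows. The class name TwoLevelCache, the level sizes, and the least-recently-used replacement policy are all assumptions made for this sketch; the discussion above does not depend on any particular replacement algorithm, and only reads are modeled.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Hypothetical two-level cache in front of a slower memory device."""

    def __init__(self, memory, l1_size=2, l2_size=4):
        self.memory = memory        # slower memory device (a dict here)
        self.l1 = OrderedDict()     # first level: smaller and faster
        self.l2 = OrderedDict()     # second level: larger and slower
        self.l1_size = l1_size
        self.l2_size = l2_size

    def _insert_l1(self, addr, value):
        # Place the entry in the first level as the most recently used.
        self.l1[addr] = value
        self.l1.move_to_end(addr)
        if len(self.l1) > self.l1_size:
            # Assumed policy: demote the least recently used entry to L2.
            old_addr, old_value = self.l1.popitem(last=False)
            self.l2[old_addr] = old_value
            self.l2.move_to_end(old_addr)
            if len(self.l2) > self.l2_size:
                # The oldest L2 entry leaves the cache entirely.
                self.l2.popitem(last=False)

    def read(self, addr):
        """Return (value, source), where source is 'L1', 'L2' or 'memory'."""
        if addr in self.l1:
            self.l1.move_to_end(addr)
            return self.l1[addr], "L1"
        if addr in self.l2:
            # Second-level hit: promote the entry to the first level.
            value = self.l2.pop(addr)
            self._insert_l1(addr, value)
            return value, "L2"
        # Miss in both levels: retrieve from the slower memory device.
        value = self.memory[addr]
        self._insert_l1(addr, value)
        return value, "memory"
```

In this sketch, a first read of an address is served from the slower memory device, an immediate second read of the same address is served from the first level, and an address that has been displaced from the first level is served from the second level until it is displaced from there as well.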
Since each processor has an associated cache, and each cache may have more than one level, care must be taken to ensure the coherence of the data that is maintained throughout the system. That is, if more than one cache may contain a copy of the same interval of shared, writable memory, a reasonable approach to parallel programming requires some provision for ensuring that all copies reflect a coherent value. One way to ensure that coherence is maintained is to maintain a directory (or table) which points to each non-local cache in which the data resides. By knowing the location of each copy of the data, each copy can either be updated, or a notation can be made within the table to indicate that the data at one or more locations is out-of-date. Such tables require pointers to multiple nodes which are caching data. However, maintaining one table which points to each copy within each node increases the width and the complexity of the directory entries within such tables, making the table relatively large.
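The directory approach described above can be sketched, under stated assumptions, as a table that keeps one entry per block of shared memory and records every node caching that block. The class name Directory, its method names, and the choice of one presence bit per node (a "full-map" entry) are hypothetical and are used here only to show why the entry width grows with the number of nodes.

```python
class Directory:
    """Hypothetical full-map directory: one entry per shared memory block."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        self.sharers = {}   # block address -> set of node ids holding a copy

    def record_read(self, block, node):
        # Note that the given node now caches a copy of the block.
        self.sharers.setdefault(block, set()).add(node)

    def record_write(self, block, writer):
        """Mark all other copies out-of-date and return their node ids."""
        stale = self.sharers.get(block, set()) - {writer}
        # Only the writer holds a current copy after the write.
        self.sharers[block] = {writer}
        return sorted(stale)

    def entry_width_bits(self):
        # Assumption: one presence bit per node in each directory entry.
        return self.num_nodes
```

For example, if nodes 0, 2 and 3 have each read a block and node 2 then writes it, record_write reports nodes 0 and 3 as holding out-of-date copies. Under the full-map assumption, each entry is 4 bits wide in a 4-node system and 64 bits wide in a 64-node system, which illustrates the growth in entry width noted above.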
Accordingly, it is an object of the present invention to provide a system and method for maintaining coherence of data stored in multiple caches within a multiprocessor system by storing relatively small and simple entries in coherence tables.