1. Field of the Invention
This invention generally relates to cache architectures for data processing systems and more particularly to a shared memory multi-processor data processing system having a plurality of second level caches.
2. Description of the Prior Art
Prior multi-processor data processing systems have used multi-level caching to enhance system performance. A popular configuration includes first level caches, each coupled and dedicated to a respective one of the processors; a second level cache that is shared by the processors; and a main memory that is above the second level cache in the storage hierarchy and is shared by the processors. Data access time is reduced for data that is resident in a lower level of the storage hierarchy.
Two common types of second level caches found in the prior art are a centralized second level cache that maps to all of addressable memory, and a second level cache that is divided into multiple portions, with each portion mapping to a predetermined address range of addressable memory. U.S. Pat. No. 5,265,232 to Gannon et al., entitled COHERENCE CONTROL BY DATA INVALIDATION IN SELECTED PROCESSOR CACHES WITHOUT BROADCASTING TO PROCESSOR CACHES NOT HAVING THE DATA (hereinafter "Gannon"), illustrates the centralized second level cache approach, and U.S. Pat. No. 5,423,016 to Tsuchiya et al., entitled BLOCK BUFFER FOR INSTRUCTION/OPERAND CACHES (hereinafter "Tsuchiya"), illustrates the non-centralized approach. The cache architecture of the co-pending patent application is similar to that of Tsuchiya.
In the system described by Gannon, there are multiple processors, each having a dedicated store-through first level cache. A centralized second level cache is shared by the plurality of processors and is mapped to all of the addressable memory. The second level cache has a priority control for selecting which memory request to process. Thus, all memory requests are funneled through the second level cache.
The cache architecture of Tsuchiya has a second level cache with multiple segments, where each segment is mappable to only a portion of the addressable memory. Each segment is dedicated to caching a predetermined range of the addressable memory space. A memory request is routed to a respective one of the second level cache segments depending upon the address range in which the memory request falls. As compared to Gannon, Tsuchiya reduces contention between the processors for access to the second level cache because requests are spread across multiple second level cache segments. The tradeoff, however, is that there may be extra overhead in routing the memory requests to the proper second level cache segments.
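The address-range routing described above can be sketched as follows. This is only an illustration under stated assumptions, not the circuitry of Tsuchiya: the segment count, the size of the address space, the equal-sized contiguous ranges, and the names `segment_for` and `route` are all hypothetical.

```python
# Hypothetical sketch of routing memory requests to second level cache
# segments by address range. All constants and names are assumptions.
SEGMENT_COUNT = 4            # assumed number of second level cache segments
ADDRESS_SPACE = 1 << 16      # assumed size of addressable memory
RANGE_SIZE = ADDRESS_SPACE // SEGMENT_COUNT  # assumed equal, contiguous ranges


def segment_for(address: int) -> int:
    """Return the index of the segment whose predetermined range holds address."""
    if not 0 <= address < ADDRESS_SPACE:
        raise ValueError("address outside addressable memory")
    return address // RANGE_SIZE


def route(requests):
    """Group (processor_id, address) requests into one queue per segment."""
    queues = {segment: [] for segment in range(SEGMENT_COUNT)}
    for processor_id, address in requests:
        queues[segment_for(address)].append((processor_id, address))
    return queues
```

Requests from different processors contend only when they land in the same queue, which reflects the reduced contention noted above; the `segment_for` lookup on every request stands in for the routing overhead.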
In a multiprocessor system, a data coherency strategy must be implemented to ensure that each of the processors has a consistent view of the cached data. If each of two processors has the same addressable data unit in its first level cache and one of the processors modifies the data unit, the other processor must be notified that the corresponding data unit in its first level cache is invalid. Gannon uses a centralized directory for coordinating data coherency between the processors and their respective first level caches. In the co-pending patent application, duplicates of the first level cache directories (hereinafter "duplicate tags") are used for coordinating data coherency.
The centralized directory of Gannon is built upon the concept of exclusive and public ownership of a data unit by the processors. Before a processor is allowed to modify a data unit, it must first obtain ownership of the data unit. The ownership of a data unit is maintained in the first level cache directory (hereinafter "tag") and in the centralized cache directory, thereby eliminating the need for duplicate tags. The concept of exclusive ownership eliminates the need to search and invalidate first level cache tags of the other processors when a data unit is modified. However, when a processor does not have exclusive ownership of a data unit and the processor needs to modify the data unit, the first level cache tags of all the other processors must first be searched and appropriate data units invalidated, thereby interrupting and adversely impacting all the other processors.
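The cost difference between writing with and without exclusive ownership can be illustrated with a minimal sketch. It models only the behavior stated above and is not Gannon's actual directory design: the class `CentralDirectory`, its methods, and the rule that a public fetch ends another processor's exclusive ownership are assumptions made for illustration.

```python
# Hypothetical model of ownership-based coherency. Names and the public-fetch
# rule are assumptions; only the write-path behavior follows the text.
class CentralDirectory:
    def __init__(self, processor_count):
        self.exclusive_owner = {}  # address -> processor holding exclusive ownership
        self.l1_tags = [set() for _ in range(processor_count)]
        self.tag_searches = [0] * processor_count  # searches suffered per processor

    def fetch_public(self, processor, address):
        """Load a data unit with public ownership."""
        # Assumption: a public fetch ends any other processor's exclusive ownership.
        if self.exclusive_owner.get(address) != processor:
            self.exclusive_owner.pop(address, None)
        self.l1_tags[processor].add(address)

    def write(self, processor, address):
        """Modify a data unit; return how many other processors were searched."""
        if self.exclusive_owner.get(address) == processor:
            return 0  # exclusive owner: no other first level tags are searched
        searched = 0
        for other, tags in enumerate(self.l1_tags):
            if other == processor:
                continue
            self.tag_searches[other] += 1  # every other processor is interrupted
            tags.discard(address)          # invalidate the unit if it is present
            searched += 1
        self.l1_tags[processor].add(address)
        self.exclusive_owner[address] = processor
        return searched
```

In this sketch the first write to a publicly owned data unit searches the tags of every other processor, including processors that never held the unit, while later writes by the same owner search none.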
The duplicate tags of the co-pending patent application are used to filter invalidation requests to the first level cache tags. When a processor modifies a data unit in its first level cache, an invalidation request is broadcast to all the duplicate tags. The duplicate tags are searched for the address of the referenced data unit. If the address is found in a duplicate tag, the address is marked invalid and the corresponding address entry in the first level cache tag is also marked invalid. If the address of the referenced data unit is not present in the duplicate tag, the respective processor is not interrupted to invalidate an entry in its first level cache tag.
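The filtering role of the duplicate tags can be sketched as below. This is a simplified software model under stated assumptions, not the hardware of the co-pending application: the class and function names are hypothetical, tags are modeled as sets of addresses, and the writer's own duplicate tag is assumed to be skipped during the broadcast.

```python
# Hypothetical model of duplicate tags filtering invalidation requests.
# All names are assumptions made for illustration.
class FirstLevelCache:
    def __init__(self):
        self.tags = set()    # addresses currently valid in the first level cache tag
        self.interrupts = 0  # times the processor was interrupted to invalidate

    def invalidate(self, address):
        self.interrupts += 1
        self.tags.discard(address)


class DuplicateTag:
    def __init__(self, cache):
        self.cache = cache
        self.tags = set()    # copy of the addresses held in the first level cache tag

    def record_fill(self, address):
        """Keep the duplicate in step when the first level cache loads a unit."""
        self.tags.add(address)
        self.cache.tags.add(address)

    def handle_invalidation(self, address):
        """Search the duplicate; interrupt the processor only on a hit."""
        if address in self.tags:
            self.tags.discard(address)
            self.cache.invalidate(address)
            return True
        return False         # miss: the processor is left undisturbed


def broadcast_invalidation(duplicate_tags, writer_index, address):
    """Send an invalidation to every duplicate tag except the writer's (assumed)."""
    for index, duplicate in enumerate(duplicate_tags):
        if index != writer_index:
            duplicate.handle_invalidation(address)
```

With three processors where only the first two hold an address, a write by the first interrupts the second once and leaves the third untouched, because the search is absorbed by the third processor's duplicate tag.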