The present invention relates in general to the memory architectures of computers. More particularly, the invention is directed to an interface between first level (L1) and second level (L2) caches in a multiprocessor architecture involving multiple caches and cache levels.
Computer architectures involving multiple processors and multilevel cache architectures related to such multiprocessors have become areas of particular interest and technology growth in recent years. The prevailing investigations have dealt with two-level cache hierarchies and the related protocols for operating such caches to maximize individual processor performance while satisfying the cache coherence requirements of the composite systems.
Within this field of investigation, the study of multilevel inclusion properties has proven to be of particular interest. The article entitled "Multilevel Cache Hierarchies: Organizations, Protocols and Performance" by Baer et al., which appeared in Vol. 6 of the Journal of Parallel and Distributed Computing, pp. 451-476, in 1989, considers at length not only the implications of a hierarchical system, but more particularly the implications of multilevel inclusion. A variation of inclusion, the concept which prescribes that the lines of data stored in the L2 cache be a superset of the lines stored in the L1 caches supported by the L2 cache, is refined through the use of a split directory in the article entitled "Extended L2 Directory for L1 Residence Recording", which appeared in the IBM Technical Disclosure Bulletin, Vol. 34, No. 8, pp. 130-133, of January 1992. The extended directory further ensures comprehensive inclusion. The use of inclusion bits to selectively enable L1 level cache snooping of memory accesses is described in European Patent Application No. 91305422.7, published Dec. 18, 1991. The prevailing objective of the designs in the various references is to define and use a "strong inclusion" architecture, in contrast to one which provides "weak inclusion".
In considering the two extremes of inclusion, a system practicing "weak inclusion" merely maintains a superset condition between the L2 cache and the associated L1 caches. As such, this technique does not maintain at the L2 cache adequate information for the L2 cache to reliably determine whether any of the L1 caches has a line sought by a memory request detected at the L2 cache. In contrast, and at the other extreme of inclusion, "strong inclusion" mandates that there be comprehensive knowledge of the L1 contents and status at the associated L2, so that only those requests for cache lines known to be valid and stored in related L1 caches percolate up from the L2.
A weak inclusion architecture is easy to design because the superset condition can be achieved simply by practicing an invalidation of the corresponding L1 lines upon any replacement of the corresponding line in the L2 cache. However, a weak inclusion architecture provides the L2 cache with significantly less information about the contents of the L1. L1 invalidations occur much more frequently, bounding the usefulness of the L1 to the processor because of the frequent update interruptions required.
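The invalidate-on-replacement rule described above can be illustrated with a minimal Python sketch. It is not taken from any cited reference: the class names, the LRU replacement policy, the unbounded L1 model, and the line addresses are all assumptions chosen for illustration. The sketch shows that the superset condition is preserved while the L2, having no record of L1 residence, must send an invalidation to every L1 on each replacement.

```python
from collections import OrderedDict


class L1Cache:
    """Hypothetical L1 model: an unbounded set of resident line addresses."""

    def __init__(self):
        self.lines = set()

    def invalidate(self, addr):
        self.lines.discard(addr)


class WeakInclusionL2:
    """Hypothetical L2 model: LRU replacement, no record of L1 contents."""

    def __init__(self, capacity, l1_caches):
        self.capacity = capacity
        self.lines = OrderedDict()      # addr -> None, oldest entry first
        self.l1_caches = l1_caches
        self.invalidations_sent = 0

    def access(self, l1, addr):
        if addr in l1.lines:
            return                      # L1 hit: the L2 never sees the access
        if addr in self.lines:
            self.lines.move_to_end(addr)
        else:
            if len(self.lines) >= self.capacity:
                victim, _ = self.lines.popitem(last=False)
                # No residence record, so every L1 is invalidated,
                # whether or not it actually holds the victim line.
                for cache in self.l1_caches:
                    cache.invalidate(victim)
                    self.invalidations_sent += 1
            self.lines[addr] = None
        l1.lines.add(addr)

    def superset_holds(self):
        """True if every L1 line is also present in the L2."""
        return all(c.lines.issubset(self.lines) for c in self.l1_caches)


l1a, l1b = L1Cache(), L1Cache()
l2 = WeakInclusionL2(capacity=2, l1_caches=[l1a, l1b])
l2.access(l1a, 0x100)
l2.access(l1b, 0x200)
l2.access(l1a, 0x300)   # L2 is full: line 0x100 is replaced and invalidated
```

In this run a single L2 replacement produces two invalidation messages, one of which goes to an L1 that never held the line.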
The memory access shielding provided by the L2 cache is improved through the implementation of a "strong inclusion" architecture. However, that architecture requires extensive communication between the L1 caches and the associated L2 cache to maintain adequate knowledge at the L2. Thus, while the strong inclusion architecture achieves the optimal shielding effect, overall system speed is degraded by the frequent communication between the L1 and L2 to maintain at the L2 the complete knowledge of the L1 contents.
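The tradeoff can be sketched in the same hedged way. The Python model below assumes a hypothetical per-line residence record at the L2 and a hypothetical notice sent by an L1 whenever it replaces a line; neither mechanism is specified in the text above, and the names are illustrative only. It shows how such a record lets the L2 shield the L1 caches from irrelevant requests, at the cost of an L1-to-L2 message per replacement.

```python
class StrongInclusionL2:
    """Hypothetical L2 model that tracks which L1 caches hold each line."""

    def __init__(self):
        self.residence = {}             # addr -> set of L1 ids holding the line
        self.replacement_notices = 0    # extra L1-to-L2 messages
        self.requests_forwarded = 0     # requests passed up to an L1

    def l1_fill(self, l1_id, addr):
        # A fill passes through the L2 anyway, so it costs no extra message.
        self.residence.setdefault(addr, set()).add(l1_id)

    def l1_replace(self, l1_id, addr):
        # Assumption: the L1 must tell the L2 about every line it replaces.
        self.residence.get(addr, set()).discard(l1_id)
        self.replacement_notices += 1

    def snoop(self, addr):
        """Return the L1 ids that must see a memory request for addr."""
        holders = sorted(self.residence.get(addr, set()))
        self.requests_forwarded += len(holders)
        return holders


l2 = StrongInclusionL2()
l2.l1_fill(0, 0x100)
l2.l1_fill(1, 0x100)
l2.l1_fill(0, 0x200)
l2.l1_replace(0, 0x100)          # L1 number 0 drops line 0x100 and reports it

hit_holders = l2.snoop(0x100)    # only L1 number 1 still holds the line
miss_holders = l2.snoop(0x300)   # no L1 holds the line: fully shielded
```

Here only one request reaches an L1, but keeping the record exact required a replacement notice that a weak inclusion design would not send.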
Thus, there remains a need for a multilevel cache architecture which provides reasonable shielding by the L2 cache without introducing a detrimental volume of communication between the L2 and supported L1 caches. In most architectures, the L2 cache must serve multiple L1 caches, and their related processors, while satisfying all cache coherence requirements. Finally, the form of the communication between levels should be consistent with the resources existing in commercially available microprocessors which have on-board L1 caches but no special L1 to L2 communication features.