1. Field of the Invention
This invention relates generally to the field of digital computers and more particularly to the area of memory hierarchy control within a computer. More specifically, it relates to a synonym detection and handling mechanism for caches in a virtual storage system.
A cache synonym is a cache directory entry that contains the absolute address translated from a current request's logical address, where that logical address does not directly locate any directory entry containing this absolute address.
This application includes the disclosed subject matter in U.S.A. patent application Ser. No.: 205,500 filed on Nov. 10, 1980 by F. O. Flusche et al and assigned to the same assignee as the subject application.
2. Description of the Prior Art
In a data processing system with a storage hierarchy, selected lines of data in main storage are copied into a high speed buffer, often called a cache, for fast access by a processor. Whenever the processor requests data, the system first checks the cache to determine whether the data is available there; if it is, the data is quickly provided to the processor. If the data is not available in the cache, it is retrieved more slowly from main storage. A portion of the untranslated logical address in each processor request is used to directly address the cache directory, rather than the translated absolute address, because waiting for the translated address would significantly decrease system performance.
Caches in current systems are typically based on the concept of "set associativity", wherein a requestor directly addresses a cache directory row (called a class) which may have several entries (called sets). The sets in a class are all associatively searched in parallel to determine if any one set in the class has the absolute (or real) address translated for the requested logical (or virtual) address. A set-associative cache is a compromise between a slowly-performing fully associative cache, in which any block of main storage may map into any position in the cache, and a fast-performing directly addressed cache, where each main storage address can map into only one class location in the cache. Fully associative caches have the liability of lengthy directory search time and an elaborate replacement (LRU) mechanism. Non-associatively addressed caches are the simplest to implement in terms of hardware, but yield significantly lower performance than the other two schemes due to increased overwriting of entries.
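The class and set organization described above can be illustrated with a minimal sketch. The names `Entry`, `SetAssociativeDirectory`, `install` and `lookup` are hypothetical and chosen only for illustration; real hardware compares all sets of the addressed class in parallel, which the loop below merely models sequentially.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    valid: bool = False
    absolute_tag: int = 0  # absolute (real) address bits held in the entry


class SetAssociativeDirectory:
    """Illustrative directory: num_classes rows, num_sets entries per row."""

    def __init__(self, num_classes: int, num_sets: int) -> None:
        self.classes: List[List[Entry]] = [
            [Entry() for _ in range(num_sets)] for _ in range(num_classes)
        ]

    def install(self, class_index: int, set_id: int, absolute_tag: int) -> None:
        self.classes[class_index][set_id] = Entry(True, absolute_tag)

    def lookup(self, class_index: int, absolute_tag: int) -> Optional[int]:
        # Only the directly addressed class is searched; every set in that
        # class is compared against the translated absolute address.
        for set_id, entry in enumerate(self.classes[class_index]):
            if entry.valid and entry.absolute_tag == absolute_tag:
                return set_id
        return None


directory = SetAssociativeDirectory(num_classes=4, num_sets=2)
directory.install(class_index=1, set_id=1, absolute_tag=0xABC)
hit_set = directory.lookup(1, 0xABC)    # found in the addressed class
miss_set = directory.lookup(2, 0xABC)   # same data, but a different class is addressed
```

In this sketch a request that addresses class 2 does not find the entry held in class 1, even though the absolute address matches.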
The size of a set-associative cache can be increased by (1) increasing the number of classes in the cache directory, which widens the address range used to access a cache class, and/or (2) increasing the number of sets in each class. If system performance is not to be decreased, an increase in set associativity requires extra hardware to examine all sets in the addressed class in parallel. Also, available integrated circuit packaging technology for cache directories does not easily lend itself to a substantial increase in set associativity. These constraints favor increasing the cache size by increasing the number of classes in the cache directory. However, as the number of classes in the cache directory is increased, the directory address bits taken from a requesting logical address must eventually expand beyond its nontranslatable field (i.e. the D field) and into its translatable field.
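The point at which the directory address spills into the translatable field can be worked out with simple bit arithmetic. The following is a sketch only; the function name `translatable_index_bits` and the example figures (4096-byte pages, 128-byte lines) are assumptions made for illustration and are not taken from the text above. All sizes are assumed to be powers of two.

```python
def translatable_index_bits(num_classes: int, line_bytes: int, page_bytes: int) -> int:
    """Number of class-address bits that fall outside the nontranslatable field.

    Assumes all three arguments are powers of two.
    """
    index_bits = (num_classes - 1).bit_length()   # bits needed to select a class
    offset_bits = (line_bytes - 1).bit_length()   # bits addressing a byte in a line
    page_bits = (page_bytes - 1).bit_length()     # width of the nontranslatable field
    return max(0, index_bits + offset_bits - page_bits)


# Assumed figures: 4096-byte pages and 128-byte lines.
fits_in_d_field = translatable_index_bits(32, 128, 4096)   # 5 + 7 - 12 = 0
spills_over = translatable_index_bits(128, 128, 4096)      # 7 + 7 - 12 = 2
```

Under these assumed figures, 32 classes can be addressed entirely from the nontranslatable field, whereas 128 classes require two bits from the translatable field.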
The cache synonym problem occurs when the cache address uses bits from the translatable field of the logical address. A cache synonym exists when the data required by a requesting logical address is available in a cache class different from the class addressed by the request. Synonyms may for example be caused by (1) requests which switch between virtual and real addresses for the same data, or (2) by one user addressing a line of data with one virtual address and another user addressing the same line with a different virtual address which locates a different class in the cache, or (3) by reassignment of the page frame to be accessed by a logical address.
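Case (2) above can be illustrated with a small sketch in which two logical addresses share one page frame but address different cache classes. The page size, line size, class count, page table contents and all names below are assumptions chosen for illustration only.

```python
PAGE_BITS = 12    # assumed 4096-byte pages (nontranslatable field)
LINE_BITS = 7     # assumed 128-byte lines
INDEX_BITS = 7    # assumed 128 classes, so 2 index bits are translatable

# Hypothetical page table: logical pages 1 and 2 both map to page frame 5.
page_table = {0x1: 0x5, 0x2: 0x5}


def translate(logical: int) -> int:
    """Translate a logical address to an absolute address."""
    page = logical >> PAGE_BITS
    offset = logical & ((1 << PAGE_BITS) - 1)
    return (page_table[page] << PAGE_BITS) | offset


def class_index(logical: int) -> int:
    """Class addressed directly by the untranslated logical address."""
    return (logical >> LINE_BITS) & ((1 << INDEX_BITS) - 1)


addr_a = 0x1080   # first user's logical address
addr_b = 0x2080   # second user's logical address for the same line

same_data = translate(addr_a) == translate(addr_b)   # both are absolute 0x5080
class_a = class_index(addr_a)                        # 33
class_b = class_index(addr_b)                        # 65
```

Both requests refer to the same line of data, yet each directly addresses a different class, so a line installed through one address appears as a synonym to a request using the other.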
A random relationship exists between a logical address and its translated absolute address. This relationship is dependent upon the assignment of a logical address to any available page frame in main storage. Thus, a given logical address can translate to any absolute address in main storage where the assigned page frame happens to be located.
Accordingly, the values of the translated bits in an absolute address are only determined at the time of translation and they may have any value. Thus, the value in any subset of bit positions in the translatable field of an absolute address is not dependent upon the value in the corresponding subset of bit positions in the related logical address; and they may have any value within the range of their permutations.
U.S. Pat. No. 3,723,976 to Alvarez et al, issued Mar. 27, 1973 and assigned to the same assignee as the subject application, teaches a different cache synonym technique not used in the subject application. In Alvarez, each processor (which is shown in a multiprocessing environment) has associated with it a store-in-cache, a fetch directory (FD), a broadcast store directory (BSD), and a translation directory (TD). An entry in the FD is accessed by the logical (virtual) address bits 18-26 of a processor request, but the corresponding entry in the BSD is accessed by real address bits 18-26 obtained from TD as a translation of the requested logical address. Hence, the BSD is not a copy directory of the FD, because the corresponding entries in BSD and FD for the same processor request can map to different locations in BSD and FD. The FD entries contain real addresses (bits 8-19), while the BSD entries contain a mixture of real and virtual address bits (i.e. real bits 8-17 and virtual bits 18, 19), in which the virtual bits 18, 19 of a BSD entry locate the corresponding entry in the FD. The result is that a principal class is accessed in FD using the request address; and if the request misses in FD, BSD is then examined to determine if any synonym location exists in FD and therefore in the cache. A double replacement invalidation problem, which is not found in the subject invention, exists in U.S. Pat. No. 3,723,976 and is caused by corresponding entries in BSD and FD not being at the same directory address. The double replacement invalidation may occur when any valid FD entry is replaced by a new entry. In the example in Alvarez FIG. 7, this single FD entry replacement causes the invalidation of two FD entries and two BSD entries, which requires two invalidations and two block castouts from the cache when the invalidated blocks were changed.
This double invalidation occurs when the new and replaced corresponding BSD entries are in different BSD locations. For any new FD entry, four possible corresponding BSD locations exist due to the four possible translatable values for virtual bits 18,19. Hence, the replaced BSD entry corresponding to the replaced FD entry and the new BSD entry corresponding to the new FD entry have a three out of four chance of occupying different BSD locations rather than the same BSD location.
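The three out of four figure can be checked by enumeration. The sketch below assumes that the two-bit values locating the new and the replaced BSD entries are independent and equally likely, which is an assumption made here for illustration; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def probability_different_locations(translatable_bits: int = 2) -> Fraction:
    """Chance that two independently, uniformly chosen values differ."""
    values = range(2 ** translatable_bits)
    pairs = list(product(values, repeat=2))   # (new entry bits, replaced entry bits)
    differing = sum(1 for new, old in pairs if new != old)
    return Fraction(differing, len(pairs))


chance = probability_different_locations(2)   # 12 differing pairs out of 16
```

With two translatable bits, 12 of the 16 equally likely pairs differ, which gives the three out of four chance stated above.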
A result of the potential double replacement castout in Alvarez is duplication of attendant hardware, e.g. duplication of the line store buffers, duplication of related RAS hardware entities for castout, etc., and, most of all, a significant decrease in main storage (MS) performance, because the additional castouts clutter the bus to MS and consume bandwidth unnecessarily.
There is no double replacement invalidation or double replacement castout with the subject invention, because it uses a copy directory (CD) and processor directory (PD) combination instead of the FD and BSD combination used in the Alvarez patent. Corresponding CD and PD entries always map into the same address in both PD and CD, unlike the FD and BSD entries in Alvarez. Also, the subject invention is capable of performing synonym detection with only a single directory, i.e. in the PD.
U.S. Pat. No. 4,332,010 issued May 25, 1982 entitled "Cache Synonym Detection and Handling Mechanism" by B. U. Messina et al (assigned to the same assignee as the subject application) detects synonyms by constructing a cache directory in a novel manner that divides the directory into 2^N groups of classes, in which N is the number of translatable bits in the cache directory address for locating any class in the directory. Each of the 2^N groups in the directory is addressed in parallel by the current processor request. Each group has the potential for containing either a principal cache hit or a synonym hit, although only one of them may be put in the directory and cache. A cache miss occurs when no principal or synonym cache hit is found in any group. No copy directory is used in detecting a cache synonym, whereas a copy directory is used in the invention of the subject application.
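The grouping of classes by translatable bits can be illustrated as follows. This is a sketch of the addressing arithmetic only, not of the hardware disclosed in the cited patent; it assumes that the translatable bits are the high-order bits of the class address, and the function name and example values are hypothetical.

```python
from typing import List, Tuple


def candidate_classes(
    principal_class: int, index_bits: int, translatable_bits: int
) -> Tuple[int, List[int]]:
    """Return the principal class and the other classes that could hold a synonym.

    Assumes the translatable bits are the high-order bits of the class address.
    """
    fixed_bits = index_bits - translatable_bits
    fixed_part = principal_class & ((1 << fixed_bits) - 1)   # nontranslatable bits
    candidates = [
        (group << fixed_bits) | fixed_part
        for group in range(1 << translatable_bits)           # one class per group
    ]
    synonyms = [c for c in candidates if c != principal_class]
    return principal_class, synonyms


# Assumed example: 7-bit class address with N = 2 translatable bits (4 groups).
principal, synonym_classes = candidate_classes(33, index_bits=7, translatable_bits=2)
```

For this assumed example, the request selects one class in each of the four groups: the principal class 33 and the potential synonym classes 1, 65 and 97.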
U.S. Pat. No. 4,136,385 to Gannon et al, issued Jan. 23, 1979, and assigned to the assignee of the present application, is concerned with the dynamic lookaside address translation (DLAT) synonym problem. The DLAT synonym problem is not related to the cache synonym problem with which the subject invention is involved. The Gannon patent provides a control means for handling common-segment DLAT synonyms used with multiple virtual storage systems in which a common page has the same virtual address in plural address spaces. The control means eliminates plural entries for a common page in a translation lookaside buffer (DLAT) by providing a common indicator in each DLAT entry containing a common page address to eliminate associating the entry with any particular address space.
IBM Maintenance Library, "3033 Processor Complex, Theory of Operation/Diagram Manual" Vol 4, Processor Storage Control Function (PSCF) and Processor Storage, Form SY22-7004-0, pages 1.4.1 and 1.6.2 show and describe a large high speed buffer concept, in which a sixteen-way set associative cache is used to avoid the synonym problem by avoiding the use of any translatable bits from the processor request logical address for addressing the cache directory.