1. Field of the Invention
This invention relates to improvements in computer memory systems, and more particularly to improvements in cache memories associated with computer memory systems.
2. Background Information
As set forth in a survey of some aspects of cache memory design by A. J. Smith, "Cache Memories", Computing Surveys, Vol. 14, No. 3, Sept., 1982, pp. 473-530, cache memories are small, high-speed memories used in modern medium- and high-speed computers to temporarily hold those portions of the contents of main memory which are believed to be currently in use. Since instructions and data in cache memories can usually be referenced in 10 to 25 percent of the time required to access main memory, cache memories permit the execution rate of the machine to be substantially increased.
Thus, a central processing unit (CPU) of a computer with a cache memory needs to spend far less time waiting for instructions and operands to be fetched and/or stored. For example, in typical large, high-speed computers (e.g., Amdahl 470V/7, IBM 3033), main memory can be accessed in 300 to 600 nanoseconds, whereas information can be obtained from a cache in 50 to 100 nanoseconds. Since the performance of such machines is already limited in instruction execution rate by cache memory access time, the absence of any cache memory would produce a substantial decrease in execution speed.
Virtually all modern large computer systems have cache memories; examples are the Amdahl 470, the IBM 3081, 3033, 370/168, and 360/195, the Univac 1100/80, and the Honeywell 66/80. Many medium and small size machines also have cache memories; examples are the DEC VAX 11/780, 11/750, and PDP-11/70, and the Apollo, which uses a Motorola 68000 microprocessor. Even microcomputers benefit from an on-chip cache, since on-chip access times are much smaller than off-chip access times.
The success of cache memories has been explained by reference to the "property of locality". The property of locality has two aspects, temporal and spatial. Over short periods of time, a program distributes its memory references nonuniformly over its address space, and which portions of the address space are favored remain largely the same for long periods of time. This first property, called temporal locality, or locality by time, means that the information which will be in use in the near future is likely to be in use already. This type of behavior can be expected from program loops in which both data and instructions are reused. The second property, locality by space, means that portions of the address space which are in use generally consist of a fairly small number of individually contiguous segments of that address space. Locality by space, then, means that the loci of reference of the program in the near future are likely to be near the current loci of reference. This type of behavior can be expected from common knowledge of programs: related data items (variables, arrays) are usually stored together, and instructions are mostly executed sequentially. Since the cache memory buffers segments of information that have been recently used, the property of locality implies that needed information is also likely to be found in the cache.
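The effect of locality on a cache can be illustrated with a minimal simulation sketch. The fully associative organization, the least-recently-used replacement policy, the block size, and the two reference traces below are all assumptions chosen for illustration; none of them is taken from the cited survey.

```python
from collections import OrderedDict

def hit_ratio(trace, capacity, block_size=4):
    """Fraction of references found in a small, fully associative LRU cache.

    `capacity` is the number of blocks the cache holds (assumed parameter).
    """
    cache = OrderedDict()              # block number -> placeholder, in LRU order
    hits = 0
    for addr in trace:
        block = addr // block_size     # neighboring addresses share a block
        if block in cache:
            hits += 1
            cache.move_to_end(block)   # mark the block as most recently used
        else:
            cache[block] = True        # fetch the block on a miss
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(trace)

# A loop that sweeps the same 16 consecutive addresses ten times
# (both temporal and spatial locality).
loop_trace = [a for _ in range(10) for a in range(16)]

# References that never revisit a block (no locality).
scattered_trace = [i * 64 for i in range(160)]

print(hit_ratio(loop_trace, capacity=8))       # 0.975
print(hit_ratio(scattered_trace, capacity=8))  # 0.0
```

In the looping trace only the first reference to each of the four blocks misses, so 156 of the 160 references hit; the scattered trace never reuses a block and never hits.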
Optimizing the design of a cache memory generally has four aspects:
(1) maximizing the probability of finding a memory reference's target in the cache (the hit ratio);
(2) minimizing the time to access information that is indeed in the cache (access time);
(3) minimizing the delay due to a miss; and
(4) minimizing the overheads of updating main memory, maintaining multicache consistency, etc.
All of these have to be accomplished under suitable cost constraints, of course.
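The interaction between the hit ratio and the two access times can be sketched with a simple weighted average. The model below (a hit costs the cache access time, a miss costs the main memory access time) and the 95 percent hit ratio are assumptions made for illustration; the 50 and 300 nanosecond figures are the lower ends of the ranges quoted above.

```python
def effective_access_time(hit_ratio, cache_ns, main_ns):
    """Average time per reference under an assumed two-level model:
    hits are served in cache_ns, misses in main_ns."""
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * main_ns

# Assumed hit ratio of 0.95 with a 50 ns cache and 300 ns main memory:
# the average is roughly 62.5 ns, far closer to the cache time.
print(effective_access_time(0.95, 50, 300))
```

Under this model, each percentage point of hit ratio lost adds (main_ns - cache_ns) / 100 nanoseconds to the average, which is why the first aspect listed above carries so much weight.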
Reference is made particularly to M. Badel, et al, "Performance evaluation of a cache memory for a minicomputer," Proc. 4th Int. Symp. on Modelling and Performance Evaluation of Computer Systems, Vienna, Austria, Feb., 1979; H. Barsamian, et al, "System design considerations of cache memories,", Proc. IEEE Computer Society Conference, IEEE, New York, pp. 107-110 (1972); D. H. Gibson, "Consideration in block-oriented systems design," Proc. Spring Jt. Computer Conf., Vol. 30, Thompson Books, Washington, D. C. pp. 75-80 (1967); and K. R. Kaplan, et al, Cache-based computer systems, IEEE Computer, Vol. 6, No. 3, pp. 30-36 (Mar., 1973). See also, D. W. Clark, et al., "The memory system of a high performance personal computer", IEEE Trans. Comput., Vol. TC-30, No. 10, pp 715-733 (Oct., 1981), which discusses the design details of a real cache. See also, B. W. Lampson, et al., "A processor for a high-performance personal computer," Proc. 7th Annual Symp. Computer Architecture, ACM, New York, N. Y., pp. 146-160 (May 6-8).
The relationship of a memory cache to the CPU and memory in a computer system of the prior art is shown in the block diagram of FIG. 1. Thus, as shown, a cache memory 12 is ordinarily located between the CPU 13 and main memory 14. A secondary memory 17, such as a disk memory or the like, may also be included, connected to the main memory 14, as shown.
In many mainframe computers, as well as minicomputers with virtual memory, the cache is addressed by real addresses, rather than virtual addresses. Examples of such computers are the Amdahl 470, IBM 3081, Univac 1100/80, Honeywell 66/80, and DEC VAX 11/780 and 11/750. This is so because these computers have multiple virtual address spaces, typically one per process. For example, the operating system has its own virtual address space, separate from those used by the user processes. In such machines, the same virtual addresses in different virtual spaces are mapped onto different physical addresses; on the other hand, different virtual addresses in different address spaces may be mapped onto the same physical address (in fact, this is the mechanism to allow sharing of information between two different virtual address spaces).
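The two mapping cases just described can be sketched with one page table per virtual address space. The page size, the space names, and the table contents below are hypothetical values chosen only to make both cases visible.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

# Hypothetical page tables, one per virtual address space:
# virtual page number -> physical frame number.
page_tables = {
    "os":     {0: 7, 1: 3},
    "user_a": {0: 12, 5: 3},
}

def translate(space, vaddr):
    """Map a virtual address in the given address space to a physical address."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_tables[space][page]
    return frame * PAGE_SIZE + offset

# Same virtual address, different spaces, different physical addresses.
print(translate("os", 0x10), translate("user_a", 0x10))

# Different virtual addresses, different spaces, same physical address
# (the sharing case: both spaces map a page onto frame 3).
print(translate("os", 1 * PAGE_SIZE + 8), translate("user_a", 5 * PAGE_SIZE + 8))
```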
If the cache is addressed with virtual addresses in machines with multiple virtual address spaces, the cache mapping mechanism becomes very complex, because the mapping mechanism also has to keep track of the address space. As a result, most of these older generation machines first map the virtual address onto the real address before accessing the cache map. Usually, the virtual to real address translation process is time consuming compared to the cache access time, and can become a performance bottleneck.
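The trade-off can be sketched by comparing two toy caches. The class names, the translation table, and the memory contents are hypothetical, and the dictionaries stand in for hardware lookup structures; this is an illustration of the bookkeeping involved, not a description of any of the machines named above.

```python
# Hypothetical translation table: (space id, virtual address) -> real address.
TRANSLATION = {("os", 100): 900, ("user", 100): 500, ("user", 200): 900}
MAIN_MEMORY = {900: "shared", 500: "private"}

class RealAddressCache:
    """Translates on every access, then looks up by real address."""
    def __init__(self):
        self.lines = {}
        self.translations = 0

    def read(self, space_id, vaddr):
        self.translations += 1                   # translation precedes every lookup
        raddr = TRANSLATION[(space_id, vaddr)]
        if raddr not in self.lines:
            self.lines[raddr] = MAIN_MEMORY[raddr]
        return self.lines[raddr]

class VirtualAddressCache:
    """Looks up by (space id, virtual address); translates only on a miss."""
    def __init__(self):
        self.lines = {}
        self.translations = 0

    def read(self, space_id, vaddr):
        key = (space_id, vaddr)                  # the tag must carry the address space
        if key not in self.lines:
            self.translations += 1
            self.lines[key] = MAIN_MEMORY[TRANSLATION[key]]
        return self.lines[key]

real, virt = RealAddressCache(), VirtualAddressCache()
for _ in range(2):                               # reference every address twice
    for space_id, vaddr in TRANSLATION:
        assert real.read(space_id, vaddr) == virt.read(space_id, vaddr)

print(real.translations, len(real.lines))        # 6 2
print(virt.translations, len(virt.lines))        # 3 3
```

The real address cache pays for a translation on all six references but holds the shared word once. The virtual address cache translates only on its three misses, but its tags must include the address space, and the shared word now occupies two entries that would have to be kept consistent.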
Because of the multiple virtual address space problem, only a few computers with virtual memory have virtual address caches. Examples of such computers are the MU-5, the S-1, the IBM 801, and the ICL 2900. The virtual address cache design is discussed by S. Bederman, "Cache management system using virtual and real tags in the cache directory", IBM Tech. Disclosure Bull., Vol. 21, No. 11, p. 4541 (Apr., 1979) and by A. G. Olbert, "Fast DLAT load for V=R translations", IBM Tech. Disclosure Bull., Vol. 22, No. 4, p. 1434 (Sept., 1979). As will become apparent, the invention is particularly suitable for use in conjunction with computer systems such as those described in copending United States patent applications by Oxley et al., entitled "COMPUTER MEMORY SYSTEM", Ser. No. 630,476, filed July 12, 1984, and by Thatte et al., entitled "COMPUTER SYSTEM ENABLING AUTOMATIC MEMORY MANAGEMENT OPERATIONS", Ser. No. 630,478, filed July 12, 1984, said applications being assigned to the assignee hereof, and incorporated herein by reference.
As its name implies, the logical address cache is addressed by logical addresses generated by the CPU. As described in said copending patent application Ser. No. 630,476, a logical address is a pair, (r, index), where "r" is the identification of a binding register and "index" is the index of a cell in the memory block bound to the binding register, r. Since none of the existing machines have the notion of logical addresses, they do not have caches that can be addressed by logical addresses.
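The notion of addressing a cache by the pair itself can be sketched as follows. The names `LogicalAddress`, `bindings`, and `read`, and the block contents, are hypothetical; the sketch models only the lookup and deliberately omits what must happen when a binding register is rebound to a different block.

```python
from typing import NamedTuple

class LogicalAddress(NamedTuple):
    r: int        # identification of a binding register
    index: int    # index of a cell in the block bound to register r

# Hypothetical bindings: binding register -> memory block (a list of cells).
bindings = {0: ["a0", "a1", "a2"], 1: ["b0", "b1"]}

cache = {}        # keyed directly by logical address; no translation on a hit

def read(addr):
    """Return the cell named by a logical address, filling the cache on a miss."""
    if addr not in cache:
        cache[addr] = bindings[addr.r][addr.index]
    return cache[addr]

print(read(LogicalAddress(r=1, index=0)))   # b0 (miss, fetched from the bound block)
print(read(LogicalAddress(r=1, index=0)))   # b0 (hit, served from the cache)
```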