In a processing system, a main memory typically stores information for use by the central processing unit (CPU). The operating speed of the CPU is generally significantly greater than that of the main memory. Hence processing systems commonly utilize a cache memory as an interface between the main memory and the CPU. The cache memory is a fast memory whose operating speed better matches that of the CPU. The cache memory stores selected contents of the main memory, such as information that the CPU requested most recently and is therefore likely to request again in the near future.
When the CPU requests information to be read from memory, the request goes to the cache memory, which checks to determine whether it contains the requested information. If so, the cache supplies the information to the CPU; if not, the cache requests the information from the main memory and upon receipt both stores it and supplies it to the CPU. Similarly, when the CPU requests new information to be written to memory, the cache memory checks to determine whether it contains the old memory information that is to be overwritten. If so, the cache memory generally stores the new information in place of the old and either immediately or at some time thereafter also writes the new information into the main memory; if not, the cache memory either writes the new information into the main memory, or it requests from the main memory the old information, stores it, and then proceeds as if it had contained the old information.
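The read and write handling described above can be illustrated with a minimal sketch. The class name `SimpleCache` and the flags `write_through` and `write_allocate` are hypothetical names introduced only for illustration; the model works on single words and ignores cache capacity and replacement, neither of which is discussed at this point in the text.

```python
class SimpleCache:
    """Toy cache placed between a CPU and a main memory (a list of words)."""

    def __init__(self, main_memory, write_through=True, write_allocate=True):
        self.main = main_memory
        self.lines = {}      # address -> cached word
        self.dirty = set()   # addresses updated in the cache but not yet in main memory
        self.write_through = write_through    # True: update main memory immediately
        self.write_allocate = write_allocate  # True: on a write miss, fetch the old word first

    def read(self, addr):
        if addr in self.lines:            # hit: supply the word from the cache
            return self.lines[addr]
        value = self.main[addr]           # miss: request the word from main memory,
        self.lines[addr] = value          # store it,
        return value                      # and supply it to the CPU

    def write(self, addr, value):
        if addr not in self.lines:
            if not self.write_allocate:
                self.main[addr] = value   # miss: write the new word into main memory only
                return
            self.lines[addr] = self.main[addr]  # miss: fetch the old word, then proceed as a hit
        self.lines[addr] = value          # store the new word in place of the old
        if self.write_through:
            self.main[addr] = value       # update main memory immediately
        else:
            self.dirty.add(addr)          # update main memory at some later time

    def flush(self):
        """Write all deferred updates back to main memory."""
        for addr in self.dirty:
            self.main[addr] = self.lines[addr]
        self.dirty.clear()
```

With `write_through=False`, a written word reaches main memory only when `flush()` is called, which corresponds to the "at some time thereafter" case.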
Information transfers between the main memory and the cache memory are often done in units of information called blocks. A block is a predetermined plurality of consecutive main memory locations, or words. The information contents of a block are likewise referred to as a block, with the understanding that it is the information contents of memory locations, and not the locations themselves, that are transferred.
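As a small illustration of block-based transfer, the following sketch maps a word address to its block and copies the contents of that block. The block size `BLOCK_WORDS = 4` and the assumption that blocks start at multiples of the block size are illustrative choices, not values taken from the text.

```python
BLOCK_WORDS = 4  # assumed number of consecutive words per block


def block_of(word_addr):
    """Return (block number, word offset within the block) for a word address."""
    return divmod(word_addr, BLOCK_WORDS)


def fetch_block(main_memory, word_addr):
    """Return a copy of the contents of the block containing word_addr.

    Only the information contents are copied; the memory locations
    themselves stay where they are.
    """
    block_number, _ = block_of(word_addr)
    start = block_number * BLOCK_WORDS
    return main_memory[start:start + BLOCK_WORDS]
```

For example, under these assumptions word address 10 lies at offset 2 of block 2, so a request for it would transfer the contents of addresses 8 through 11.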
To improve the speed of block transfers in the processing system, the main memory and communication buses connecting the main memory with other devices, such as the cache memory, are often capable of transferring more than one word of information simultaneously. For the same purpose, the cache memory is often interleaved. An interleaved memory is one comprising a plurality of memory parts which are capable of being read or written simultaneously. For purposes of this application, the term memory part refers to one or more memory devices functioning together to produce one-word-wide storage. For example, a memory part for 32-bit words may be made up of four 8-bit-wide memory devices.
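The example of a 32-bit memory part made up of four 8-bit-wide devices can be sketched as follows. The class name `MemoryPart` and the assignment of the least significant byte to device 0 are assumptions made for illustration only.

```python
class MemoryPart:
    """One-word-wide (32-bit) storage built from four 8-bit-wide devices."""

    def __init__(self, depth):
        # Each device is modelled as a bytearray holding `depth` 8-bit entries.
        self.devices = [bytearray(depth) for _ in range(4)]

    def write(self, index, word):
        # All four devices are addressed with the same index; each stores one byte.
        for i, device in enumerate(self.devices):
            device[index] = (word >> (8 * i)) & 0xFF

    def read(self, index):
        # Reassemble the 32-bit word from the four byte-wide devices.
        return sum(device[index] << (8 * i) for i, device in enumerate(self.devices))
```

The four devices function together as a single unit: they share one address and jointly store or produce one word.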
Words simultaneously received by an interleaved memory are simultaneously stored in different ones of the memory parts. Likewise, words that are to be simultaneously transmitted by the interleaved memory are simultaneously retrieved from different ones of the memory parts. An interleaved cache memory can therefore receive and store a plurality of words simultaneously.
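A minimal sketch of this behavior follows. It assumes low-order interleaving, in which consecutive word addresses fall in consecutive memory parts; the text does not specify the mapping, and the names `InterleavedMemory`, `store_group`, and `load_group` are hypothetical. The loops stand in for accesses that hardware would perform at the same time.

```python
class InterleavedMemory:
    """N-way interleaved memory: N memory parts, each one word wide."""

    def __init__(self, n_parts, depth):
        self.n = n_parts
        self.parts = [[0] * depth for _ in range(n_parts)]

    def locate(self, addr):
        """Map a word address to (memory part, row within that part)."""
        return addr % self.n, addr // self.n

    def store_group(self, start_addr, words):
        """Store up to N consecutive words, each in a different memory part."""
        if len(words) > self.n:
            raise ValueError("at most one word per memory part per access")
        for k, word in enumerate(words):
            part, row = self.locate(start_addr + k)
            self.parts[part][row] = word

    def load_group(self, start_addr, count):
        """Retrieve up to N consecutive words, each from a different memory part."""
        if count > self.n:
            raise ValueError("at most one word per memory part per access")
        locations = (self.locate(start_addr + k) for k in range(count))
        return [self.parts[part][row] for part, row in locations]
```

Because at most N consecutive addresses are handled per access, no two words of a group ever map to the same memory part, which is the condition that allows the parts to operate simultaneously.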
Interleaving of cache memory has been expensive, because of the multiplicative increase in the number of memory parts and associated circuitry, such as reading, writing, and addressing circuitry, that has been required in order to implement the interleaved memory. For example, two-way interleaving, to allow simultaneous storage and retrieval of two words, has required the use of two memory parts and duplication of the associated circuitry; three-way interleaving, to allow simultaneous storage and retrieval of three words, has required the use of three memory parts and triplication of the associated circuitry; and N-way interleaving has required the use of N memory parts, each with its own associated circuitry.
In the prior art, this problem has been especially acute for set-associative cache memories. A set-associative cache memory is one wherein any one memory word may be stored in any one of a plurality of predetermined cache memory locations. Each of these predetermined memory locations is a member of a different set. Hence there are as many sets as there are different locations in which a given memory word may be stored in the cache. Increasing the number of locations in which a word may be stored increases the chance of a particular word being found stored in the cache memory, and hence increases the effectiveness of the cache in speeding up system operation. But in order for a set-associative cache to operate rapidly, all of the predetermined locations in which a given word may be stored must be examined at the same time for the presence of a desired word. For this purpose, each set is commonly implemented in a different memory part, and the plurality of memory parts are accessed in parallel. Hence in a set-associative memory there commonly are at least as many memory parts as there are sets. And when a set-associative memory is interleaved, commonly each of the memory parts and its associated circuitry are replicated in the manner described above for interleaved memory. Thus, if an M-way set-associative cache memory, i.e., one having M sets, is N-way interleaved, it typically comprises at least M × N memory parts. While these increases in the number of memory parts provide added performance, by increasing the cache size, such increases in the numbers of required parts quickly become excessive, making the use of interleaved set-associative cache memories uneconomical for most applications.
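The set-associative lookup and the resulting part count can be sketched as follows, using "set" in the sense defined above (one set per memory part). The tag-and-row address split, the naive replacement choice, and the names `SetAssociativeCache`, `fill`, and `memory_parts_required` are illustrative assumptions; the loop over sets stands in for an examination that hardware would perform in parallel.

```python
class SetAssociativeCache:
    """M-way set-associative cache with one table (memory part) per set."""

    def __init__(self, m_sets, rows):
        self.rows = rows
        # Each entry is None (empty) or a (tag, word) pair.
        self.sets = [[None] * rows for _ in range(m_sets)]

    def _split(self, addr):
        # Assumed address split: low part selects the row, the rest is the tag.
        return addr % self.rows, addr // self.rows

    def lookup(self, addr):
        """Examine the same row of every set; return the word, or None on a miss."""
        row, tag = self._split(addr)
        for table in self.sets:
            entry = table[row]
            if entry is not None and entry[0] == tag:
                return entry[1]
        return None

    def fill(self, addr, word):
        """Store a word in one of the M locations permitted for its address."""
        row, tag = self._split(addr)
        for table in self.sets:            # reuse the entry if the tag is already present
            if table[row] is not None and table[row][0] == tag:
                table[row] = (tag, word)
                return
        for table in self.sets:            # otherwise take the first empty location
            if table[row] is None:
                table[row] = (tag, word)
                return
        self.sets[0][row] = (tag, word)    # naive replacement: overwrite set 0


def memory_parts_required(m_sets, n_interleave):
    """Parts needed when every set is replicated for N-way interleaving."""
    return m_sets * n_interleave
```

Under this replication scheme, a 4-way set-associative cache that is 2-way interleaved needs `memory_parts_required(4, 2)`, i.e. 8, memory parts, each with its own associated circuitry.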