1. Field of the Invention
This invention relates generally to a cache memory and more specifically to such a memory which is highly suited for use with multiprocessor systems.
2. Description of the Prior Art
In order to improve effective memory transfer rates and, accordingly, raise processing speeds, it is a known practice to provide a high speed cache memory between a processor and a low speed main memory. The cache memory works by saving a duplicate copy of the most recently used data. When the processor requests data, a cache control circuit (viz., a tag address comparator) checks whether the data is in the cache memory. If it is, the processor obtains it quickly because the cache memory is very fast. Otherwise, the data is fetched from the slower (but larger) main memory. When that happens, the cache memory copies the data fetched from the main memory so that it will be available quickly the next time.
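The read flow described above can be illustrated with a minimal sketch. It is an assumption-laden toy model, not the circuit of FIG. 1: the class name `SimpleCache`, the dictionary-based storage, and the absence of any capacity limit are all hypothetical simplifications.

```python
class SimpleCache:
    """Toy model: a small fast store in front of a larger slow store."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # slow backing store (address -> data)
        self.lines = {}                 # fast store holding duplicate copies
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # The membership test stands in for the tag address comparator.
        if address in self.lines:
            self.hits += 1
            return self.lines[address]
        # Cache miss: fetch from main memory and keep a copy for next time.
        self.misses += 1
        data = self.main_memory[address]
        self.lines[address] = data
        return data
```

Reading the same address twice produces one miss followed by one hit, which is the behavior that locality of reference exploits.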
This procedure is effective because most computer programs have "locality", that is, they tend to work with a small group of items for a long time, whereby a high hit rate can be achieved.
Detection of a cache hit or a cache miss is described briefly below and will be discussed later in more detail with reference to FIG. 1. A tag address storage section, which forms part of a cache memory, outputs a plurality of tag addresses in response to a set address contained in an access address signal applied from the processor. Following this, the tag address comparator compares each of the tag addresses from the tag address storage section with a tag address contained in the access address signal from the processor, and generates a signal indicating a cache hit or a cache miss.
When cache memories are applied to a multiprocessor system, a cache memory is arranged between each processor and the main memory. However, application of known cache memories to such a multiprocessor system has encountered the problems that the system may operate erroneously or that the operation of a processor must be temporarily stopped. These problems arise from the fact that each of the conventional cache memories is provided with only a single tag address comparator. Before describing this invention in detail, the problems of the prior art will be discussed further with reference to FIG. 1.
FIG. 1 is a block diagram showing a known cache memory unit of the type to which this invention is applicable. When such a cache memory is applied to a multiprocessor system, a plurality of caches are provided, each arranged between a corresponding one of a plurality of processors (CPUs) and a system bus to which a main memory is coupled.
The FIG. 1 arrangement generally comprises a data storage section 9, a tag address storage section 6, a valid tag address information storage section 7, and a least recently used (LRU) WAY information storage section 4, wherein the term "WAY" denotes one of the small storage regions into which a memory is divided. Each of the storage sections 6, 7 and 9 consists of a plurality of WAYs (four WAYs, for example, but only two WAYs are shown in the Figure). Each WAY of the data storage section 9 is divided into four blocks 1 through 4 and stores data for quick reference by a processor (not shown), while each WAY of the tag address storage section 6 stores tag addresses of the data saved in the corresponding WAY of the data storage section 9. On the other hand, each WAY of the valid tag address information storage section 7 stores tag address information indicating whether or not each of the tag addresses in the corresponding WAY of the section 6 is valid, while the LRU WAY information storage section 4 stores information as to which WAY should be updated by the data fetched from a main memory (not shown) via a system bus in the event of a cache miss.
As illustrated in FIG. 1, an input/output (I/O) data buffer 1 is provided for temporarily holding data that will be subsequently delivered to the processor, the data storage section 9, or the main memory. An access address buffer 2 is coupled to receive an access address signal from the processor, and is divided into three buffer sections 2a, 2b and 2c which are assigned to temporarily hold tag, set and block addresses, respectively. A controller 3, comprised of complex logic circuitry, is coupled to a control terminal of the processor. The controller 3 is not described in detail here for brevity.
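The division of the access address into tag, set and block addresses can be sketched as follows. The field order (tag in the most significant bits, block in the least significant bits) and the width `SET_BITS` are assumptions made for illustration only; the two-bit block field follows from the four blocks per WAY described above.

```python
# Hypothetical field widths; the text does not specify them.
BLOCK_BITS = 2  # selects one of the four blocks
SET_BITS = 6    # assumed: 64 sets per WAY


def split_address(address):
    """Split an access address into (tag, set, block) fields,
    mirroring buffer sections 2a, 2b and 2c."""
    block = address & ((1 << BLOCK_BITS) - 1)
    set_index = (address >> BLOCK_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (BLOCK_BITS + SET_BITS)
    return tag, set_index, block
```

For example, under these assumed widths, `split_address((11 << 8) | (5 << 2) | 2)` returns `(11, 5, 2)`.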
The set address is applied, from the set address buffer 2b, to each WAY of the storage sections 6 and 7, and causes the storage sections 6 and 7 to respectively output the corresponding tag address and the corresponding tag address information therefrom, which are sent to a tag address comparator 5. The comparator 5 checks to see if the tag address from the buffer 2a coincides with any of the tag addresses from the tag address storage section 6, and also checks to see if a matched tag address (if any) is valid.
In the event that the tag address from the buffer 2a is equal to one of the tag addresses from the storage section 6 and the matched tag address proves to be valid, the comparator 5 outputs two kinds of signals: a cache hit or miss signal and a WAY hit signal. The cache hit or miss signal is applied to the processor via the controller 3, while the WAY hit signal is sent to a write WAY selector 8 and a read WAY selector 10.
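The comparison performed by the comparator 5 may be sketched as below. This is only a software model under stated assumptions: the function name `compare_tags` and the nested-list layout `store[way][set_index]` are hypothetical, and the loop examines the WAYs one after another, whereas a hardware comparator would typically examine them at the same time.

```python
def compare_tags(tag, set_index, tag_store, valid_store):
    """Return (cache_hit, way_hit) for one access.

    tag_store[way][set_index]   -> stored tag address (section 6)
    valid_store[way][set_index] -> True if that tag is valid (section 7)
    """
    for way in range(len(tag_store)):
        matched = tag_store[way][set_index] == tag
        if matched and valid_store[way][set_index]:
            return True, way   # cache hit, plus the WAY that hit
    return False, None         # cache miss: no valid matching tag
```

A matching tag whose validity flag is cleared is treated as a miss, as the text requires.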
The cache memory has three operation modes: (1) a cache hit READ mode in which the data saved in the cache is read out to the processor in response to a cache hit; (2) a cache hit WRITE mode in which the data saved in the cache is updated or rewritten by the data applied from the processor upon a cache hit; and (3) a cache miss BLOCK WRITE mode which occurs in the event of a cache miss and in which a data element required by the processor is transferred from the main memory to the processor and, at substantially the same time, the four data blocks, one of which contains the data element, are saved in the cache so as to be available next time.
In the cache hit READ mode, the read WAY selector 10 selects one of the WAYs of the data storage section 9 in response to the WAY hit signal and the data required by the processor is specified by set and block addresses applied to the storage section 9. Thereafter, the data defined in the selected WAY is applied to the processor through the I/O data buffer 1. On the other hand, in the case of the cache hit WRITE mode, the write WAY selector 8 selects one of the WAYs of the data storage section 9 and the address of the data transferred from the processor is specified by set and block addresses applied to the storage section 9. Accordingly, the data applied from the processor is written into that address in the selected WAY to update it.
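The two cache hit modes can be sketched as a pair of small functions. The names `hit_read` and `hit_write` and the layout `data_store[way][set_index][block]` are assumptions for illustration; the `way_hit` argument plays the role of the WAY hit signal applied to the selectors 10 and 8.

```python
def hit_read(data_store, way_hit, set_index, block):
    """Cache hit READ: return the data in the WAY chosen by the WAY hit signal."""
    return data_store[way_hit][set_index][block]


def hit_write(data_store, way_hit, set_index, block, value):
    """Cache hit WRITE: overwrite the data at the addressed location
    in the WAY chosen by the WAY hit signal."""
    data_store[way_hit][set_index][block] = value
```

In both cases the set and block addresses locate the data within the WAY, and the WAY hit signal decides which WAY is used.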
In the cache miss BLOCK WRITE mode, the comparator 5 fails to find a matched valid tag address and a cache miss is detected. When this happens, the comparator 5 applies a cache miss signal to the processor through the controller 3. In response to the cache miss signal, the processor accesses the main memory and obtains therefrom, through an I/O data buffer/controller 13 and the I/O data buffer 1, the data element not found in the cache memory. On the other hand, a block load buffer 12 receives, from the buffer/controller 13, the four-block data, one block of which contains the data element transferred to the processor. The four-block data is sequentially written, block by block, into the WAY which has been selected by a block load WAY selector 11. It should be noted that the selector 11 is controlled by the output of the LRU WAY information storage section 4.
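A minimal sketch of the block load follows. It assumes a two-WAY cache, one LRU entry per set, and that the tag and validity entries are refreshed together with the data; none of these details is stated in the text, and the name `block_write_on_miss` is hypothetical.

```python
def block_write_on_miss(tag, set_index, line_from_memory,
                        tag_store, valid_store, data_store, lru_way):
    """Load a four-block line into the WAY named by the LRU information."""
    way = lru_way[set_index]  # output of section 4 drives selector 11
    # The four-block data is written sequentially, block by block.
    for block, value in enumerate(line_from_memory):
        data_store[way][set_index][block] = value
    # Assumed: tag and validity are updated so that later accesses hit.
    tag_store[way][set_index] = tag
    valid_store[way][set_index] = True
    # Assumed two-WAY policy: the other WAY becomes least recently used.
    lru_way[set_index] = 1 - way
    return way
```

The returned value identifies the WAY that was replaced, which is convenient when checking the sketch but has no counterpart in FIG. 1.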
As mentioned previously, each of the conventional cache memories in a multiprocessor system is provided with only one tag address comparator 5. Accordingly, when data in a given cache is rewritten, it is impossible to check, without accessing another cache, whether or not the data at the same address in that other cache has also been rewritten. Therefore, there is a possibility that the non-updated content of the other cache is erroneously transferred to the processor assigned to that cache. In order to avoid such a problem, the operations of the processors must be suspended until all the caches other than the actually updated cache have been checked and confirmed not to hold incorrect data. This procedure reduces system performance.
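The stale-data problem described above can be reproduced with a toy model of two private caches sharing one main memory. The class `PrivateCache`, the write-through policy, and the address `0x40` are assumptions chosen for illustration; the text does not state the write policy of the known cache.

```python
main_memory = {0x40: 1}


class PrivateCache:
    """Toy per-processor cache with no knowledge of other caches."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def read(self, address):
        if address not in self.lines:          # miss: copy from main memory
            self.lines[address] = self.memory[address]
        return self.lines[address]

    def write(self, address, value):
        self.lines[address] = value
        self.memory[address] = value           # assumed write-through


cache_a = PrivateCache(main_memory)
cache_b = PrivateCache(main_memory)
cache_a.read(0x40)
cache_b.read(0x40)            # both caches now hold the value 1
cache_a.write(0x40, 2)        # only cache A and main memory are updated
stale = cache_b.read(0x40)    # cache B hits on its old copy and returns 1
```

Cache B returns the old value because nothing checks its tags when cache A is rewritten, which is the erroneous transfer the paragraph above describes.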