The present invention relates to a cache apparatus and a control method for managing a cache memory in a multiprocessor system and, more particularly, to a cache apparatus and a control method that enable a cache memory to hold data whose validity is uncertain and allow a processor to process the data speculatively.
Generally, the access time to the main storage of a computer system is far longer than the operating cycle of a processor. When an access to the main storage occurs, therefore, the processor must wait for the data to be transmitted from the main storage. A cache apparatus is a memory that has a smaller capacity than the main storage but operates at a high speed. It is arranged between the processor and the main storage and reduces the apparent access time to the main storage, thereby reducing the data waiting time of the processor.
FIG. 1 shows a table structure of a cache memory provided for a cache apparatus. A cache line 122 serving as a unit of storage comprises a tag 124 and data 130. The tag 124 includes a status tag 126 and an address tag 128. When a reading request is received from a processor, the cache memory searches for the cache line whose address tag 128 coincides with the request address. When no such cache line exists, the result is a cache miss. Even if a cache line with a coincident address value exists, the result is still a cache miss if the status tag 126 indicates that the data held in the cache line is invalid. On a cache miss, it is necessary to access the main storage or another cache memory. When a cache line with a coincident address tag 128 exists and the status tag 126 indicates that the data held in the cache line is valid, the result is a cache hit. On a cache hit, it is unnecessary to access the main storage or another cache, so the data waiting time of the processor is short. As mentioned above, when a cache line whose address value coincides with the request address exists, whether the main storage or another cache memory must be accessed, and hence whether the data waiting time of the processor is short, is determined by whether the data held in the cache line is valid or invalid.
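The hit and miss conditions described above can be illustrated with a minimal Python sketch. It is only an illustration under simplifying assumptions: the names `CacheLine` and `lookup` are hypothetical, the status tag 126 is reduced to a single valid flag, and the cache is searched linearly rather than by set or index.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CacheLine:
    address_tag: int  # corresponds to address tag 128
    valid: bool       # simplified stand-in for status tag 126 (assumption)
    data: bytes       # corresponds to data 130


def lookup(lines: List[CacheLine], request_address: int) -> Optional[bytes]:
    """Return the line's data on a cache hit, or None on a cache miss."""
    for line in lines:
        if line.address_tag == request_address:
            # Address matched: the status tag decides between hit and miss.
            return line.data if line.valid else None
    return None  # no line with a coincident address tag


lines = [CacheLine(0x10, True, b"ab"), CacheLine(0x20, False, b"cd")]
hit = lookup(lines, 0x10)           # matching tag, valid data
miss_invalid = lookup(lines, 0x20)  # matching tag, but invalid data
miss_absent = lookup(lines, 0x30)   # no matching tag at all
```

Both kinds of miss return `None` here; in the apparatus described, either one would trigger an access to the main storage or another cache memory.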
On the other hand, in a multiprocessor system, each processor in many cases has its own cache apparatus. For the data in the cache memory provided for each processor, it is necessary to keep the data correct, namely, to keep the data consistent among the cache memories. To keep the data consistent, each cache memory is managed in accordance with a rule called a cache coherence protocol so that the data matches among the cache memories. Consequently, it is possible to access the data in the cache memory. A state where the consistency of the data is maintained among the cache memories in the multiprocessor system will now be explained, taking as an example a cache apparatus using the MESI protocol, which is known as an invalidation type cache coherence protocol.
FIG. 2A shows a status transition for a reading request of the MESI protocol. FIG. 2B shows a status transition for a writing request of the MESI protocol. Symbols of the status transition denote the following contents.
M: Modified. Valid data has been held only in one of a plurality of caches and the data has been modified. It is not guaranteed that a value of the data is the same as that in the main storage.
E: Exclusive. Valid data has been held only in one of a plurality of caches.
S: Shared. The same data has been held in a plurality of caches.
I: Invalid. The data in the cache is invalid.
self: Case where a request from a self processor has been processed.
other: Case where a request from another cache apparatus has been processed.
self-if copy: Case where the cache is in the invalid status for a reading request and the other caches hold the data.
self-if no copy: Case where the cache is in the invalid status for a reading request and no cache holds the data.
Since a state where no data exists in the cache is, in a practical sense, equivalent to a state where data is held in the cache in the Invalid I status, the Invalid I status denotes, for convenience, both of the following cases.
I. Case where the data is held in the cache in the Invalid I status.
II. Case where no data exists in the cache.
Similarly, the expression that no data exists in the cache denotes both of the following cases.
I. Case where no data really exists in the cache.
II. Case where the data is held in the cache in the Invalid I status.
FIG. 3A shows cache apparatuses of the multiprocessor system. Cache apparatuses 100-1 and 100-2 are provided for processors 102-1 and 102-2, respectively, and are connected in common to a main storage 106 through a bus 104. As shown in FIG. 3A, it is assumed that data is held in the Shared S status in a certain cache line in the cache memories of both the cache apparatuses 100-1 and 100-2. As shown in FIG. 3B, when the processor 102-1 issues a writing request for the data in the cache apparatus 100-1, the data in the cache apparatus 100-1 is modified and its status is changed from Shared S to Modified M. However, the consistency of the data between the cache apparatuses 100-1 and 100-2 cannot be maintained if the data in the cache apparatus 100-1 is merely modified. Therefore, the data in the cache apparatus 100-1 must be modified only after an invalidating operation has changed the status of the cache apparatus 100-2 from Shared S to Invalid I. After that, as shown in FIG. 3C, when the processor 102-2 issues a reading request for the modified data, the request results in a cache miss, because the data in the cache apparatus 100-2 was invalidated to the Invalid I status by the writing request from the processor 102-1 in FIG. 3B.
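The sequence of FIGS. 3A to 3C can be traced with a small Python sketch. The table below assumes the conventional writing-request transitions of an invalidation type MESI protocol; FIG. 2B is not reproduced in the text, so the table is an assumption about its content, and the names `MESI_WRITE` and `write` are hypothetical.

```python
# Writing-request transitions of an invalidation-type MESI protocol
# (assumed to match FIG. 2B). Key: (current status, requester) -> next status.
MESI_WRITE = {
    ("M", "self"): "M", ("M", "other"): "I",
    ("E", "self"): "M", ("E", "other"): "I",
    ("S", "self"): "M", ("S", "other"): "I",
    ("I", "self"): "M", ("I", "other"): "I",
}


def write(statuses, writer):
    """Apply a writing request from cache `writer` to every cache's status."""
    return [
        MESI_WRITE[(status, "self" if index == writer else "other")]
        for index, status in enumerate(statuses)
    ]


# FIG. 3A: both caches hold the line in the Shared S status.
after_write = write(["S", "S"], writer=0)      # FIG. 3B: processor 102-1 writes
read_hits_in_cache_2 = after_write[1] != "I"   # FIG. 3C: processor 102-2 reads
```

Under these assumed transitions the second cache ends in the Invalid I status, so its next reading request misses.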
Ordinarily, the block size of a cache line is larger than the size of the data handled by the processor. It is now assumed that the cache block size is two words and the data processed by the processor is one word, namely, half the block size. As shown in FIG. 4A, it is assumed that the cache apparatuses 100-1 and 100-2 hold the same data in cache lines 122-1 and 122-2, for example, 2-word data in which the head 1-word portion is set to "12" and the latter 1-word portion is set to "34". In such a state, as shown in FIG. 4B, it is assumed that the processor 102-1 issues a writing request for the head 1-word portion of the cache line 122-1 and the value is rewritten from "12" to "56". At this time, since the status of the cache memory of each of the cache apparatuses 100-1 and 100-2 is managed in units of the cache lines 122-1 and 122-2, all of the data in the cache line 122-2 of the cache apparatus 100-2 is invalidated by the status transition to Invalid I. Subsequently, as shown in FIG. 4C, it is assumed that the processor 102-2 issues a reading request to the cache line 122-2 in the cache apparatus 100-2 for the data in the latter 1-word portion. In this case, even though the latter 1-word portions of the cache lines 122-1 and 122-2 both still hold the same value "34", the result is a cache miss because the status of the cache line 122-2 is Invalid I. An access to the main storage or another cache occurs and the processor 102-2 enters a data waiting state. This is a cache miss due to so-called false sharing. The invalidating operation causing such a cache miss is called an "unpreferable invalidating operation".
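The false sharing case of FIGS. 4A to 4C can be reproduced with the following Python sketch. It is a simplified model, not the apparatus itself: the `Line` class and the helper functions are hypothetical, and the status is tracked per line exactly as the text describes.

```python
class Line:
    """A two-word cache line whose status is managed per line, not per word."""

    def __init__(self, words):
        self.words = list(words)
        self.status = "S"


def write_word(own, others, index, value):
    # Line-granular invalidation: every other copy of the whole line becomes I.
    for other in others:
        other.status = "I"
    own.words[index] = value
    own.status = "M"


def read_word(line, index):
    """Return the word on a hit, or None on a miss (status Invalid I)."""
    return line.words[index] if line.status != "I" else None


line_1 = Line([12, 34])  # cache line 122-1 (FIG. 4A)
line_2 = Line([12, 34])  # cache line 122-2 (FIG. 4A)
write_word(line_1, [line_2], index=0, value=56)      # FIG. 4B
result = read_word(line_2, index=1)                  # FIG. 4C
second_word_unchanged = line_1.words[1] == line_2.words[1]
```

The read misses although the requested word was never modified, which is the situation the text calls an unpreferable invalidating operation.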
On the other hand, as shown in FIG. 5A, it is assumed that certain data is held in the Exclusive E status in the cache apparatus 100-1 of the processor 102-1, and that the cache apparatus 100-2 of the processor 102-2 holds no data or holds the data in the Invalid I status. In this instance, as shown in FIG. 5B, when the processor 102-2 issues a reading request for the data to the cache apparatus 100-2, the data held in the Exclusive E status in the cache apparatus 100-1 is transferred to the cache apparatus 100-2 and provided to the processor 102-2. Consequently, the data is shared by the cache apparatuses 100-1 and 100-2. Accordingly, an operation to transfer the data from the cache apparatus 100-1 or the main storage to the cache apparatus 100-2 and an operation to change the status of the cache apparatus 100-1 from Exclusive E to Shared S are needed. After the status of the cache apparatus 100-1 has been changed from Exclusive E to Shared S in response to the reading request from the processor 102-2, when the processor 102-1 issues a writing request as shown in FIG. 5C, an invalidating operation to change the status of the cache apparatus 100-2 from Shared S to Invalid I is needed. After the invalidation, the status of the cache apparatus 100-1 is changed from Shared S to Modified M.
Recent processors have functions for dynamically or statically predicting conditional branches and speculatively executing commands. Consequently, the processor issues speculative reading requests. The execution of a speculative reading command is a command execution based on a prediction. Accordingly, when the prediction is correct, the result is used; when it is wrong, the speculative execution result is abandoned. In other words, there is a possibility that the data read out by a speculatively executed reading request is not used. Even when the reading request from the processor 102-2 in FIG. 5B is based on an erroneous prediction, the operation to change the status of the cache apparatus 100-1 from Exclusive E to Shared S is executed in response to it. As shown in FIG. 5C, therefore, the writing request subsequently issued from the processor 102-1 requires an invalidating operation to change to Invalid I the data of the cache apparatus 100-2, which is held in the Shared S status although it was never actually needed. The operation to change the status of the cache memory to Shared S due to the execution of a reading request whose read-out data ends up unused is called an "unpreferable reading operation".
According to the invention, there are provided a cache apparatus and a control method, in which a cache miss which is caused by the "unpreferable invalidating operation" is reduced and a data waiting time of a processor is shortened.
According to the invention, there are further provided a cache apparatus and a control method, in which an invalidating operation which is necessary due to the "unpreferable reading operation" is reduced and a data waiting time of a processor is shortened.
According to the invention, there are provided cache apparatuses which are provided individually for a plurality of processors, mutually connected by a bus, and connected to a main storage by the bus, each comprising a cache memory in which a part of the data in the main storage is held on a cache line unit basis and the status of the data held in the cache line is distinguished by three kinds of statuses
I. Valid
II. Invalid
III. Unknown
and a cache controller for, when a processing request is received from a self processor or another cache apparatus, processing the data held in the relevant cache line in accordance with the status of Valid, Invalid, or Unknown and, after the processing, changing the status of the data held in the relevant cache line to a predetermined status determined in order to keep the cache coherence.
As mentioned above, according to the invention, data whose validity is uncertain may be stored in the cache line as data whose validity is unknown. The problem is solved by replacing the "unpreferable invalidating operation" with a "weak invalidating operation (weak-invalidate)" and the "unpreferable reading operation" with a "weak reading operation (weak-read)". The "weak invalidating operation" allows the data in the cache line to be kept as data whose validity is unknown instead of invalidating the cache line. For example, in the "unpreferable invalidating operation" of the MESI protocol, when a writing request is issued to one of two cache memories holding data in the Shared S status, the data in the other cache memory is invalidated. In the "weak invalidating operation" of the invention, however, the data is kept as data whose validity is unknown without being invalidated. Therefore, whereas a subsequent reading request from the processor results in a cache miss after the "unpreferable invalidating operation", after the "weak invalidating operation" of the invention the cache memory still holds the data whose validity is unknown, so the data can be supplied to the processor as speculation data in response to the reading request.
Such a "weak invalidating operation" of the invention can be regarded as an MESIU protocol in which the Unknown U status is added to the MESI protocol. In cache management control according to the MESIU protocol, the "weak reading operation" of the invention is executed when the processor issues a weak reading request, that is, a reading request which permits data that is held in the cache memory and whose validity is unknown to be provided as speculation data. In this case, the cache apparatus which received the reading request reads out the valid data that it holds, transfers it to the other cache apparatus, responds to the requesting source, and copies it to the self cache apparatus. At this time, in the "weak reading operation" of the invention, the data transferred from the other cache memory is handled as data whose validity is unknown instead of as valid data. That is, in the "unpreferable reading operation" of the MESI protocol, when data is transferred from another cache memory in the Exclusive E status to a cache memory in the Invalid I status, the data in each cache memory is changed to the Shared S status. When the request is processed by the "weak reading operation" of the invention, however, the status of the data in the other cache memory is not changed but kept as Exclusive E, and the data transferred to the cache memory as the reading target is held as data whose validity is unknown, namely, in the Unknown U status. Therefore, even if a writing request is subsequently issued to the other cache memory in the Exclusive E status, no invalidating operation is needed for the cache memory that holds the same data in the Unknown U status.
The "weak reading operation" of the invention can also be set as a fixed function of the cache apparatus, so that it is executed in cases other than the case where the processor issues a weak reading request which permits data that is held in the cache memory and whose validity is unknown to be provided as speculation data.
According to another embodiment of the invention, there are provided cache apparatuses which are provided individually for a plurality of processors, mutually connected by a bus, and connected to a main storage by the bus, each comprising a cache memory in which a part of the data in the main storage is held on a cache line unit basis and the status of the data held in the cache line is distinguished by two kinds of statuses
I. Valid
II. Unknown
and a cache controller for, when a processing request is received from a self processor or another cache apparatus, processing the data held in the relevant cache line in accordance with the status of Valid or Unknown and, after the processing, changing the status of the holding data in the relevant cache line to a predetermined status determined in order to keep the cache coherence.
Also in this case, data whose validity is uncertain may be stored in the cache line as data whose validity is unknown. The problem is similarly solved by processing the "unpreferable invalidating operation" by the "weak invalidating operation (weak-invalidate)" and by processing the "unpreferable reading operation" by the "weak reading operation (weak-read)". This can be regarded as an MESU protocol in which the Invalid I status is removed from the MESIU protocol of the invention.
When a reading request is issued to a cache line which holds data whose validity is unknown, the cache controller determines whether providing speculation data to the requesting source is permitted or inhibited. When the providing is permitted, the data whose validity is unknown is provided to the requesting source as the speculation data. When the providing is inhibited, the valid data is obtained from the other cache apparatus or the main storage and provided to the requesting source. After the speculation data has been provided, when the requesting source issues a validity confirming request for the speculation data, the cache controller confirms the real validity of the speculation data and notifies the requesting source of the result. The cache controller can also be constructed so that it confirms the real validity of the speculation data at the same time as it provides the speculation data, stores the confirmation result, and notifies the requesting source of the stored result when the requesting source issues the validity confirming request. Further, the cache controller can also be constructed so that it confirms the real validity of the speculation data by itself at the same time as it provides the speculation data and notifies the requesting source of the confirmation result.
When it is determined from the confirmation result that the speculation data is valid, the cache controller notifies the processor as the requesting source of the confirmation result and allows the processor to accept the execution result of the command based on the speculation data. When it is determined that the speculation data is invalid, the cache controller notifies the processor of the confirmation result and allows the processor to abandon the execution result of the command based on the speculation data. Because the cache apparatus of the invention can hold data whose validity is unknown, the data can be speculatively supplied to the processor. By providing the speculation data, cache misses are reduced, the waiting time of the processor is shortened, and the cache performance is improved.
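The supply of speculation data followed by a validity confirming request can be sketched in Python as follows. This is a toy model under stated assumptions: the class `SpeculativeCache` and its methods are hypothetical, the dictionary `authoritative` stands in for the main storage and the other cache apparatuses, validity is confirmed by comparing values (the text does not specify the mechanism), and a line fetched after an inhibited speculation is simply marked Shared S.

```python
class SpeculativeCache:
    def __init__(self, authoritative):
        # Stand-in for the main storage / other caches (assumption).
        self.authoritative = authoritative
        self.lines = {}  # address -> (status, value)

    def read(self, address, speculation_permitted):
        """Return (value, is_speculative)."""
        status, value = self.lines.get(address, ("I", None))
        if status in ("M", "E", "S"):
            return value, False              # ordinary cache hit
        if status == "U" and speculation_permitted:
            return value, True               # supplied as speculation data
        value = self.authoritative[address]  # obtain the valid data instead
        self.lines[address] = ("S", value)   # status choice is an assumption
        return value, False

    def confirm(self, address, speculated_value):
        """Validity confirming request: was the speculation data correct?"""
        return self.authoritative[address] == speculated_value


storage = {0x40: 34, 0x48: 99}
cache = SpeculativeCache(storage)
cache.lines[0x40] = ("U", 34)  # validity unknown, value happens to be current
cache.lines[0x48] = ("U", 12)  # validity unknown, value is stale

good = cache.read(0x40, speculation_permitted=True)
bad = cache.read(0x48, speculation_permitted=True)
safe = cache.read(0x48, speculation_permitted=False)
```

A processor that received `good` would keep its speculative result once `confirm` reports the data valid, while one that received `bad` would abandon it; `safe` shows the inhibited path, which waits for the valid data.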
When a reading request is issued to a cache line holding data whose validity is unknown and whose address in the main storage is a predetermined specific address, the cache controller inhibits providing the data to the requesting source as speculation data. When the data has an address other than the specific address, the cache controller permits the data to be provided as speculation data. Likewise, when the reading request for the cache line holding the data whose validity is unknown was generated by the processor executing a speculation data inhibiting command, the cache controller inhibits providing the data as speculation data. When the reading request was generated by the processor executing a speculation data permitting command, for example, as a result of a branch prediction, the cache controller permits the data to be provided as speculation data. In this way, whether a request is a speculative reading request is determined from the address in the main storage or from the command executed by the processor, and the cache management control distinguishes the speculative command execution of the processor from ordinary, non-speculative command execution.
The cache controller executes an MESIU cache coherence protocol in which the status of the data held in the cache line is distinguished by five statuses: the four statuses of the MESI protocol, namely, Modified M, Exclusive E, and Shared S, which indicate valid data, and Invalid I, plus the status Unknown U showing that the validity of the data is unknown. The MESIU protocol realizes the "weak invalidating operation" of the invention as mentioned above.
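The permit and inhibit rules described above for data in the Unknown U status can be sketched as a small decision function. The text presents the address and the executed command as alternative discriminators; combining both in one function is a design choice of this sketch, and the address range and command names are hypothetical.

```python
# Hypothetical address range for which speculation data is never supplied.
INHIBITED_RANGE = range(0xF000, 0x10000)


def speculation_permitted(address, command):
    """Decide whether Unknown-status data may be supplied as speculation data."""
    if address in INHIBITED_RANGE:
        return False                 # inhibited by the specific address
    # "weak_read" models a speculation data permitting command,
    # "read" models a speculation data inhibiting command (assumed names).
    return command == "weak_read"


ordinary_address_weak = speculation_permitted(0x1000, "weak_read")
ordinary_address_strict = speculation_permitted(0x1000, "read")
inhibited_address_weak = speculation_permitted(0xF100, "weak_read")
```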
In the case where the reading request for the holding data in the cache line is processed, the cache controller processes as follows on the basis of the MESIU protocol.
I. If there is the reading request from the self processor in the Modified M status, the status is changed to the same Modified M status. If there is the reading request from the other cache apparatus, the status is changed to the Shared S status.
II. If there is the reading request from the self processor in the Exclusive E status, the status is changed to the same Exclusive E status. If there is the reading request from the other cache apparatus, the status is changed to the Shared S status.
III. If there is the reading request from the self processor or the reading request from the other cache apparatus in the Shared S status, the status is changed to the same Shared S status.
IV. If there is the reading request from the self processor in the Invalid I status and the other cache apparatus does not hold the relevant line, the status is changed to the Exclusive E status. If the other cache apparatus holds the relevant line, the status is changed to the Shared S status. Further, if there is the reading request from the other cache apparatus, the status is changed to the same Invalid I status.
V. If the data is copied from the main storage in response to the reading request from the self processor in the Unknown U status, the status is changed to the Exclusive E status. If the data is copied from the other cache apparatus, the status is changed to the Shared S status. Further, if there is the reading request from the other cache apparatus, the status is changed to the same Unknown U status.
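The reading-request transitions listed above can be written as a lookup table. The sketch below is only an encoding of items I to V; the event names are hypothetical, and for the Unknown U status the event `self_no_copy` is used for "copied from the main storage" and `self_copy` for "copied from the other cache apparatus".

```python
# MESIU reading-request transitions: (current status, event) -> next status.
MESIU_READ = {
    ("M", "self"): "M", ("M", "other"): "S",
    ("E", "self"): "E", ("E", "other"): "S",
    ("S", "self"): "S", ("S", "other"): "S",
    ("I", "self_no_copy"): "E", ("I", "self_copy"): "S", ("I", "other"): "I",
    ("U", "self_no_copy"): "E", ("U", "self_copy"): "S", ("U", "other"): "U",
}


def next_status_on_read(status, event):
    """Return the status of a cache line after a reading request."""
    return MESIU_READ[(status, event)]


unknown_refreshed_from_cache = next_status_on_read("U", "self_copy")
unknown_seen_by_other = next_status_on_read("U", "other")
```

An ordinary read by the self processor always leaves the line in a valid status, so Unknown U data is replaced by a valid copy rather than supplied as is.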
When the cache controller processes a writing request for the data held in the cache line, it processes the data as follows on the basis of the MESIU protocol.
I. If there is the writing request from the self processor in the Modified M status, the status is changed to the same Modified M status. If there is the writing request from the other cache apparatus, the status is changed to the Unknown U status.
II. If there is the writing request from the self processor in the Exclusive E status, the status is changed to the Modified M status. If there is the writing request from the other cache apparatus, the status is changed to the Unknown U status.
III. If there is the writing request from the self processor in the Shared S status, the status is changed to the Modified M status. If there is the writing request from the other cache apparatus, the status is changed to the Unknown U status.
IV. If there is the writing request from the self processor in the Invalid I status, the status is changed to the Modified M status. If there is the writing request from the other cache apparatus, the status is changed to the same Invalid I status.
V. If there is the writing request from the self processor in the Unknown U status, the status is changed to the Modified M status. If there is the writing request from the other cache apparatus, the status is changed to the same Unknown U status.
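The writing-request transitions listed above can likewise be encoded as a table and applied to all caches at once. The helper `apply_write` and the list-of-statuses representation are assumptions of this sketch.

```python
# MESIU writing-request transitions: (current status, requester) -> next status.
MESIU_WRITE = {
    ("M", "self"): "M", ("M", "other"): "U",
    ("E", "self"): "M", ("E", "other"): "U",
    ("S", "self"): "M", ("S", "other"): "U",
    ("I", "self"): "M", ("I", "other"): "I",
    ("U", "self"): "M", ("U", "other"): "U",
}


def apply_write(statuses, writer):
    """Apply a writing request from cache `writer` to every cache's status."""
    return [
        MESIU_WRITE[(status, "self" if index == writer else "other")]
        for index, status in enumerate(statuses)
    ]


# Two caches share a line; the first processor writes to it.
after_shared_write = apply_write(["S", "S"], writer=0)
```

Compared with plain MESI, the other copy ends in Unknown U instead of Invalid I, which is the weak invalidating operation: the data stays available as speculation data.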
When the cache controller processes a weak reading request, which is issued from the processor and permits data held in the Unknown U status to be provided, it processes the data as follows on the basis of the MESIU protocol for cache coherence in the weak reading mode.
I. If there is the weak reading request from the self processor in the Modified M status, the status is changed to the same Modified M status. If there is the weak reading request from the other cache apparatus, the status is changed to the Exclusive E status.
II. If there is the weak reading request from the self processor in the Exclusive E status or the weak reading request from the other cache apparatus, the status is changed to the same Exclusive E status.
III. If there is the weak reading request from the self processor or the weak reading request from the other cache apparatus in the Shared S status, the status is changed to the same Shared S status.
IV. If there is the weak reading request from the self processor in the Invalid I status and the other cache apparatus does not hold the relevant line, the status is changed to the Exclusive E status. If the other cache apparatus holds the relevant line, the status is changed to the Unknown U status. Further, if there is the weak reading request from the other cache apparatus, the status is changed to the same Invalid I status.
V. If there is the weak reading request from the self processor or the weak reading request from the other cache apparatus in the Unknown U status, the status is changed to the same Unknown U status.
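The weak reading transitions listed above can be traced for the situation of FIG. 5 with the following sketch. The helper `weak_read` is hypothetical, and it assumes that "the other cache apparatus holds the relevant line" means that another cache holds it in the Modified M, Exclusive E, or Shared S status.

```python
# MESIU weak-reading transitions: (current status, event) -> next status.
MESIU_WEAK_READ = {
    ("M", "self"): "M", ("M", "other"): "E",
    ("E", "self"): "E", ("E", "other"): "E",
    ("S", "self"): "S", ("S", "other"): "S",
    ("I", "self_no_copy"): "E", ("I", "self_copy"): "U", ("I", "other"): "I",
    ("U", "self"): "U", ("U", "other"): "U",
}


def weak_read(statuses, reader):
    """Apply a weak reading request from cache `reader` to all statuses."""
    # Assumption: "holds the relevant line" means a valid (M, E or S) copy.
    others_hold = any(
        status in "MES" for index, status in enumerate(statuses) if index != reader
    )
    result = []
    for index, status in enumerate(statuses):
        if index != reader:
            event = "other"
        elif status == "I":
            event = "self_copy" if others_hold else "self_no_copy"
        else:
            event = "self"
        result.append(MESIU_WEAK_READ[(status, event)])
    return result


# FIG. 5A: cache 100-1 holds the line in E, cache 100-2 in I (or not at all).
after_weak_read = weak_read(["E", "I"], reader=1)
```

The first cache keeps the Exclusive E status, so a later write by its own processor needs no invalidating operation on the second cache.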
The MESIU protocol transitions for the weak reading request can also be fixedly set in the cache controller, rather than being executed only when a weak reading request is received from the processor.
The cache controller may instead have the MESU protocol, in which the status of the data held in the cache line is distinguished by four statuses obtained by adding the Unknown U status, showing that the validity of the data is unknown, to an MES protocol in which the status of the held data is distinguished by the three statuses of Modified, Exclusive, and Shared, each indicating valid data.
When the cache controller processes a reading request for the data held in the cache line, it processes the data as follows on the basis of the MESU protocol.
I. If there is the reading request from the self processor in the Modified M status, the status is changed to the same Modified M status. If there is the reading request from the other cache apparatus, the status is changed to the Shared S status.
II. If there is the reading request from the self processor in the Exclusive E status, the status is changed to the same Exclusive E status. If there is the reading request from the other cache apparatus, the status is changed to the Shared S status.
III. If there is the reading request from the self processor or the reading request from the other cache apparatus in the Shared S status, the status is changed to the same Shared S status.
IV. If there is the reading request from the self processor in the Unknown U status and the other cache apparatus does not hold the relevant line, the status is changed to the Exclusive E status. If the other cache apparatus holds the relevant line, the status is changed to the Shared S status. Further, if there is the reading request from the other cache apparatus, the status is changed to the same Unknown U status.
When the cache controller processes a writing request for the data held in the cache line, it processes the data as follows on the basis of the MESU protocol.
I. If there is the writing request from the self processor in the Modified M status, the status is changed to the same Modified M status. If there is the writing request from the other cache apparatus, the status is changed to the Unknown U status.
II. If there is the writing request from the self processor in the Exclusive E status, the status is changed to the Modified M status. If there is the writing request from the other cache apparatus, the status is changed to the Unknown U status.
III. If there is the writing request from the self processor in the Shared S status, the status is changed to the Modified M status. If there is the writing request from the other cache apparatus, the status is changed to the Unknown U status.
IV. If there is the writing request from the self processor in the Unknown U status, the status is changed to the Modified M status. If there is the writing request from the other cache apparatus, the status is changed to the same Unknown U status.
When the cache controller processes a weak reading request, which is issued from the processor and permits speculation data held in the Unknown U status in the cache line to be provided, it processes the data as follows on the basis of the MESU protocol.
I. If there is the reading request from the self processor in the Modified M status, the status is changed to the same Modified M status. If there is the weak reading request from the other cache apparatus, the status is changed to the Exclusive E status.
II. If there is the reading request from the self processor or the weak reading request from the other cache apparatus in the Exclusive E status, the status is changed to the same Exclusive E status.
III. If there is the reading request from the self processor or the reading request from the other cache apparatus in the Shared S status, the status is changed to the same Shared S status.
IV. If there is the reading request from the self processor in the Unknown U status and the other cache apparatus does not hold the relevant line, the status is changed to the Exclusive E status. If the other cache apparatus holds the relevant line or there is the reading request from the other cache apparatus, the status is changed to the same Unknown U status.
The MESU protocol transitions for the weak reading request can also be fixedly set in the cache controller, rather than being executed only when a weak reading request is received from the processor.
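The three MESU transition lists above (reading, writing, and weak reading) can be collected into one table, which makes it easy to check that the Invalid I status never appears. The layout and the event names `self_no_copy` and `self_copy` are assumptions of this sketch.

```python
# MESU transitions: request kind -> {(current status, event): next status}.
MESU = {
    "read": {
        ("M", "self"): "M", ("M", "other"): "S",
        ("E", "self"): "E", ("E", "other"): "S",
        ("S", "self"): "S", ("S", "other"): "S",
        ("U", "self_no_copy"): "E", ("U", "self_copy"): "S", ("U", "other"): "U",
    },
    "write": {
        ("M", "self"): "M", ("M", "other"): "U",
        ("E", "self"): "M", ("E", "other"): "U",
        ("S", "self"): "M", ("S", "other"): "U",
        ("U", "self"): "M", ("U", "other"): "U",
    },
    "weak_read": {
        ("M", "self"): "M", ("M", "other"): "E",
        ("E", "self"): "E", ("E", "other"): "E",
        ("S", "self"): "S", ("S", "other"): "S",
        ("U", "self_no_copy"): "E", ("U", "self_copy"): "U", ("U", "other"): "U",
    },
}


def next_status(request, status, event):
    """Return the status of a cache line after the given request."""
    return MESU[request][(status, event)]


# Every status that occurs as a source or a destination in any table.
statuses_used = set()
for table in MESU.values():
    for (source, _event), destination in table.items():
        statuses_used.update((source, destination))
```

In this encoding the Unknown U status takes over the role that Invalid I plays in the MESIU tables.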
The invention also provides a cache control method for cache apparatuses which are provided individually for a plurality of processors, mutually connected by a bus, and connected to a main storage by the bus.
This cache control method is characterized in that
I. a part of the data in the main storage is held in the cache memory on a cache line unit basis and the status of the data held in the cache line is distinguished by three kinds of statuses of Valid, Invalid, and Unknown,
II. when the processing request is received from the self processor or the other cache apparatus, the data held in the relevant cache line is processed in accordance with the status of Valid, Invalid, or Unknown (MESIU protocol), and
III. the status of the holding data in the relevant cache line is changed to a predetermined status in order to keep the cache coherence after the processing of the request.
According to another aspect of the cache control method of the invention,
I. a part of the data in the main storage is held in the cache memory on a cache line unit basis and the status of the data held in the cache line is distinguished by two kinds of statuses of Valid and Unknown,
II. when the processing request is received from the self processor or the other cache apparatus, the data held in the relevant cache line is processed in accordance with the status of Valid or Unknown (MESU protocol), and
III. the status of the holding data in the relevant cache line is changed to a predetermined status in order to keep the cache coherence after the processing of the request.
The details of the cache control method according to the invention are fundamentally the same as those of the apparatus construction.