1. Field of the Invention
The present invention relates to a cache apparatus and a control method for managing a cache memory in a multiprocessor system and, more particularly, to a cache apparatus and a control method that increase access performance by controlling various statuses or states of cache data.
2. Description of the Related Arts
In response to the recent demand for higher processing speed in computer systems, each CPU (processor) in a multiprocessor system has a cache apparatus. Data in the cache provided for each CPU is managed block by block in accordance with a rule called a cache coherence protocol, which maintains the correctness of the data, namely, sharing and consistency of the data among the caches.
As a conventional general cache protocol, an MESI cache protocol for managing four statuses or states of MESI has been known. FIGS. 1A and 1B are status transition diagrams of the conventional MESI cache protocol. FIG. 1A is a fetching protocol in case of a fetching request (reading request). FIG. 1B is a storing protocol in case of a storing request (writing request). Symbols of the status transition denote the following contents.
M: Modified. Valid data is held only in one of a plurality of caches. Data has been modified. It is not guaranteed that a value of the data is the same as that in a main storage.
E: Exclusive. Valid data is held only in one of a plurality of caches.
S: Shared. The same data is held in a plurality of caches.
I: Invalid. Data in the cache is invalid.
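The four definitions above can be read as constraints on the states that different caches may hold for the same block at the same time. The following minimal Python sketch encodes them. The names `State`, `is_consistent`, and `may_differ_from_main_storage` are illustrative assumptions and are not part of the protocol description.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def is_consistent(states):
    """Check the per-cache states of one block against the MESI definitions."""
    valid = [s for s in states if s is not State.I]
    sole_holders = [s for s in valid if s in (State.M, State.E)]
    # M and E both mean that valid data is held in only one cache.
    if sole_holders:
        return len(valid) == 1
    return True


def may_differ_from_main_storage(state):
    """Only Modified data is not guaranteed to match the main storage."""
    return state is State.M
```

For example, `is_consistent([State.M, State.I, State.I])` holds, while a Modified copy next to a Shared copy is rejected.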
In the cache control using the conventional MESI cache protocol as mentioned above, if a certain CPU issues a fetching request that refers to a data block held in the cache apparatus of another system, it is necessary to write the data block into the main storage MS. The access time increases by the time required for this writing operation.
FIGS. 2A to 2E show a multiprocessor system comprising CPUs 100-1 to 100-3 and cache apparatuses 102-1 to 102-3 and relate to a case where the CPU 100-2 issues a fetching request and a storing request to the same data block. FIGS. 2A and 2B first show a fetching process of the CPU 100-2. First in FIG. 2A, the CPU 100-2 issues the fetching request to the cache apparatus 102-2 of the self system. However, since the status of the data block denotes Invalid I, the CPU 100-2 issues the fetching request through a bus to the cache apparatus 102-1 in which the data block has been held in the status of Modified M. In FIG. 2B, the data block is obtained and, in response to the CPU 100-2, the status of the data block in the cache apparatus 102-2 is changed from Invalid I to Shared S. The status of the data block in the cache apparatus 102-1 on the data fetching destination side is switched from Modified M to Shared S. Further, the data block which received the fetching request is written into the main storage MS.
FIGS. 2C to 2E show a case where the CPU 100-2 subsequently issues the storing request to the same data block. As shown in FIG. 2C, when the CPU 100-2 issues the storing request to the self system cache apparatus, a block changing request is issued to the cache apparatus 102-1 of the other system holding the same data block in the status of Shared S. As shown in FIG. 2D, the status of the data block is switched from Shared S to Invalid I. The CPU 100-2 performs a storing process to store new data into a data block in the cache apparatus. When the storing process is finished, as shown in FIG. 2E, the completion of the storage is notified from the cache apparatus 102-2 to the CPU 100-2. The process is finished.
FIGS. 3A to 3D show a case where the CPUs 100-2 and 100-3 successively issue the fetching request to the same data block. First, the fetching process of the CPU 100-2 in FIGS. 3A and 3B is the same as that in FIGS. 2A and 2B. It is necessary to write the data block into the main storage MS in response to the fetching request to the cache apparatus 102-1. FIGS. 3C and 3D show the fetching process of the CPU 100-3. The data block is obtained by the fetching request to the cache apparatus 102-2 of the other system. In FIG. 3D, the completion of the fetch is notified to the CPU 100-3.
In such a conventional MESI cache protocol, however, as shown in FIGS. 2B and 3B, when the data block is obtained by issuing the fetching request to the cache apparatus of the other system, the cache apparatus 102-1 of the other system must perform the writing operation into the main storage MS. As shown in FIG. 2C, when the CPU 100-2 stores into the data block that is held in the status of Shared S in both the cache apparatus 102-2 of the self system and the cache apparatus 102-1 of the other system, it is necessary to switch the status of the data block in the other system from Shared S to Invalid I by issuing the block changing request (status changing request) to the cache apparatus 102-1 of the other system.
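The two costs described above, the write into the main storage on a fetch and the status changing request on a store, can be counted with a small simulation. This is a minimal sketch under simplifying assumptions: one data block, states kept as one-letter strings in a dictionary keyed by cache apparatus number, and hypothetical function names `mesi_fetch` and `mesi_store`.

```python
def mesi_fetch(states, requester):
    """Fetch under conventional MESI. Returns the number of main-storage writes."""
    writes = 0
    if states[requester] != "I":
        return writes  # cache hit: no bus traffic
    for cache, st in states.items():
        if cache == requester:
            continue
        if st == "M":
            writes += 1          # Modified data is written into main storage
            states[cache] = "S"
        elif st == "E":
            states[cache] = "S"
    others_valid = any(st != "I" for c, st in states.items() if c != requester)
    states[requester] = "S" if others_valid else "E"
    return writes


def mesi_store(states, requester):
    """Store under conventional MESI. Returns 1 if a status changing request is issued."""
    others_valid = any(st != "I" for c, st in states.items() if c != requester)
    for cache in states:
        if cache != requester:
            states[cache] = "I"
    states[requester] = "M"
    return 1 if others_valid else 0


# Sequence of FIGS. 2A to 2E: CPU 100-2 fetches, then stores.
states = {"102-1": "M", "102-2": "I", "102-3": "I"}
fetch_writes = mesi_fetch(states, "102-2")
store_requests = mesi_store(states, "102-2")
```

In this sketch the fetch costs one main-storage write and the following store costs one status changing request, which are the two overheads the later protocols try to remove.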
FIG. 4A is another status or state transition diagram of the fetching protocol in the conventional MESI cache protocol. Also in this case, when the data block in the status of Modified M is switched to Invalid I in response to the fetching request from the other system, the writing operation into the main storage MS is performed. FIG. 4B shows another status or state transition diagram of the fetching protocol in the conventional MESI cache protocol. In this case, when the data block in the status of Modified M is switched to Invalid I in response to the fetching request from the other system, it is not switched to Shared S even if the fetching request is issued after that, so the writing operation into the main storage MS is unnecessary.
JP-A-6-124240 (Japanese Patent Application No. 4-275825) eliminates the writing operation into the main storage MS that is performed at the time of the fetching process of the MESI cache protocol in FIG. 2A or 3A. As shown in FIGS. 5A and 5B, its protocol has five statuses obtained by adding Shared Modified O to the MESI cache protocol. Reference symbol O of Shared Modified denotes Owner.
FIGS. 6A to 6E show the multiprocessor system comprising the CPUs 100-1 to 100-3 and cache apparatuses 102-1 to 102-3 and relate to the case where the CPU 100-2 issues the fetching request and the storing request to the same data block. First, FIGS. 6A and 6B show the fetching process of the CPU 100-2.
In FIG. 6A, the CPU 100-2 issues the fetching request to the self system cache apparatus 102-2. Since the data block is in the status of Invalid I, the fetching request is issued via the bus to the cache apparatus 102-1 holding the data block in the status of Modified M. In FIG. 6B, the data block is obtained, a response is made to the CPU 100-2, and the status of the data block of the cache apparatus 102-2 is changed from Invalid I to Shared S. In this instance, the status of the data block of the cache apparatus 102-1 on the data fetching destination side is switched from Modified M to Shared Modified O. Since it is not switched to Shared S, the writing operation into the main storage MS of the data block which received the fetching request is unnecessary.
FIGS. 6C to 6E show the case where the CPU 100-2 issues the storing request to the same data block. As shown in FIG. 6C, when the CPU 100-2 issues the storing request to the self system cache apparatus 102-2, the block changing request (status changing request) is issued to the other system cache apparatus 102-1 holding the same data block in the status of Shared Modified O. As shown in FIG. 6D, therefore, the status of the data block is switched from Shared Modified O to Invalid I and the CPU 100-2 performs the storing process to store new data into the data block of the cache apparatus. When the storing process is finished, as shown in FIG. 6E, the completion of the storage is notified from the cache apparatus 102-2 to the CPU 100-2 and the process is finished.
FIGS. 7A to 7D show the case where the fetching request is successively issued from the CPU 100-2 and the CPU 100-3 to the same data block in this order. The fetching process of the CPU 100-2 in FIGS. 7A and 7B is the same as that in FIGS. 6A and 6B. Since the status is switched from Modified M to Shared Modified O in response to the fetching request to the cache apparatus 102-1, the writing operation into the main storage MS is unnecessary. FIGS. 7C and 7D show the fetching process of the CPU 100-3 of the other system, which is subsequently performed. The data block is fetched by the fetching request to the cache apparatus 102-2 and the completion of the fetch is notified to the CPU 100-3 in FIG. 7D.
As mentioned above, in the MESIO cache protocol having five statuses in FIGS. 5A and 5B, Shared Modified O, positioned midway between Modified M and Shared S, is newly provided so that the data block becomes shared step by step. Accordingly, even if there is the fetching request of the other system to the data block in the status of Modified M, the writing operation into the main storage MS is unnecessary. In such a 5-status cache protocol, the writing operation into the main storage MS is needed only in the case where the data block in the status of Modified M or Shared Modified O is replaced in response to the storing request of the self system CPU.
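The behavior of the five-status protocol described above can be sketched in the same style as a small state simulation. The function names `fetch` and `replace` and the dictionary layout are assumptions made for illustration; only the transitions (Modified M to Shared Modified O on a fetch from the other system, write-back only when M or O is replaced) come from the description.

```python
def fetch(states, requester):
    """Fetch under the five-status protocol. Returns main-storage writes (always 0)."""
    if states[requester] == "I":
        others_valid = False
        for cache, st in states.items():
            if cache == requester or st == "I":
                continue
            others_valid = True
            if st == "M":
                states[cache] = "O"   # supplier becomes owner; no write-back
            elif st == "E":
                states[cache] = "S"
        states[requester] = "S" if others_valid else "E"
    return 0


def replace(states, cache):
    """Evict the block. Returns 1 if the data must be written into main storage."""
    dirty = states[cache] in ("M", "O")
    states[cache] = "I"
    return 1 if dirty else 0


# Sequence of FIGS. 7A to 7D: CPU 100-2 and then CPU 100-3 fetch the block.
states = {"102-1": "M", "102-2": "I", "102-3": "I"}
writes = fetch(states, "102-2") + fetch(states, "102-3")
```

Neither fetch writes into the main storage in this sketch; the write is deferred until the block held in M or O is replaced.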
Even in such a cache protocol having five statuses, however, as shown in FIG. 6C, when the storing process of the self system is performed subsequently to the self system fetching process, the status changing request to switch the status in the other system from Shared Modified O to Invalid I is needed. The storing process is lengthened by the time required to execute the status changing request to the other system. Since the self system frequently executes the storing process on a data block that it obtained on the cache by its own fetching request, it would be desirable to devise an improved cache coherency protocol for improving the access performance of the whole apparatus.
According to the invention, there are provided a cache apparatus and a control method in which access performance to a cache is raised by finely dividing the statuses used to manage a data block on the cache, thereby reducing status changing requests to other systems.
The invention provides cache apparatuses, each of which is provided within a multiprocessor system having a storage device. The cache apparatus of the invention comprises a cache memory and a cache controller. The cache memory stores data from the storage device and information indicative of a status of the data. The cache controller controls the cache memory by using a Writable Modified state for allowing sharing performance to be possessed step by step in a case where there is a fetching request. In, for example, a fetching protocol, the cache controller expresses the data state by one of the following six states and controls the cache memory accordingly:
Invalid state;
Shared state;
Exclusive state;
Modified state;
Shared Modified state; and
Writable Modified state.
As a self apparatus fetching process for forming the Writable Modified state, in the case where the cache apparatus obtains data in a Modified state from another cache apparatus in response to the fetching request from a CPU of the self apparatus to data in an Invalid state, the cache controller of the cache apparatus changes the Invalid state of the obtained data to the Writable Modified state. At the same time, when a fetching request is received from the other apparatus with respect to the data in the Modified state, the self apparatus switches the state of the data from the Modified state to the Invalid state. By such conditions, the sixth state added in the invention, the Writable Modified state, is formed. As a self apparatus storing process after the Writable Modified state is formed, the cache controller of the cache apparatus stores the data in response to a storing request to the data in the Writable Modified state from the self apparatus CPU and, thereafter, switches the Writable Modified state of the data to the Modified state without needing to notify the other apparatus of the state change. The cache apparatus of the invention as mentioned above finely divides the state of the data to be managed on the cache, thereby reducing the number of state changing requests to the other apparatus and, consequently, raising the accessing speed to the cache apparatus.
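The formation of the Writable Modified state and the store that follows it can be sketched as below. This is an illustrative model only: the one-letter state codes (with "W" for Writable Modified), the function names `fetch_from_modified` and `store`, and the dictionary layout are assumptions, and only the transitions described in the paragraph above are modeled.

```python
def fetch_from_modified(states, requester, supplier):
    """Requester in Invalid obtains data that the supplier holds in Modified."""
    if states[requester] != "I" or states[supplier] != "M":
        raise ValueError("sketch covers only the Invalid-from-Modified case")
    states[requester] = "W"   # Invalid -> Writable Modified
    states[supplier] = "I"    # Modified -> Invalid


def store(states, requester):
    """Store by the self CPU. Returns the number of state changing requests issued."""
    if states[requester] not in ("W", "M"):
        raise ValueError("sketch covers only stores to W or M data")
    # No other apparatus holds valid data, so none has to be notified.
    states[requester] = "M"
    return 0


states = {"102-1": "M", "102-2": "I"}
fetch_from_modified(states, "102-2", "102-1")
requests = store(states, "102-2")
```

Compared with the five-status sequence of FIGS. 6A to 6E, the same fetch-then-store pattern issues no state changing request here, because the supplier already gave up its copy at fetch time.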
In a multiprocessor system having a cache apparatus of a store-in system, it is known that when the CPU of the other apparatus fetches data stored by a certain CPU, there is a high probability that the CPU of the other apparatus subsequently stores into the same data. It is also known that when the CPU of the other apparatus fetches such data without storing into it, the Shared state continues. This can also be presumed from the fact that when a plurality of CPUs refer to the same data and the data lies in a mere reference region, no storing operation is performed, whereas if the data lies in a control region in which data is stored, the storing operation may also be executed by the CPU of the other apparatus. Exploiting this property, the invention forms a 6-state construction in which a Writable Modified state is added to the conventional five states, and an apparatus holding data in the Writable Modified state can store into the data by itself, without sharing it with the other apparatus. Therefore, in the case where the CPU issues the storing request subsequently to the fetching request, the cache protocol of five states must request a state change of the other apparatus, whereas the cache protocol of six states of the invention can store the data without needing to notify the other apparatus of the state change. The six states can be expressed by three bits, the same number as in the case of five states. Also, in the case where there is no storing request but there is a fetching request from the other apparatus, the situation can be handled in a manner similar to the cache protocol of five states, without increasing the number of requests.
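The remark about the number of bits follows from simple arithmetic: n states need the smallest b with 2^b >= n. The short sketch below checks this for four, five, and six states. The helper name `bits_needed` and the particular bit assignment in `ENCODING` are hypothetical; the description does not specify an encoding.

```python
def bits_needed(num_states):
    """Smallest number of bits that can distinguish num_states values."""
    return max(1, (num_states - 1).bit_length())


# Hypothetical 3-bit assignment; any injective mapping would do.
STATE_NAMES = [
    "Invalid",
    "Shared",
    "Exclusive",
    "Modified",
    "Shared Modified",
    "Writable Modified",
]
ENCODING = {name: format(i, "03b") for i, name in enumerate(STATE_NAMES)}
```

Four states fit in two bits, while five and six states both need three, so adding the sixth state does not widen the per-block status field.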
The cache controller of the cache apparatus executes either of the two following different processes as a self apparatus fetching process after the Writable Modified state is formed.
(First Self Apparatus Fetching Process)
As a first self apparatus fetching process, in the case where a fetching request is received for the data in the Writable Modified state from the other apparatus in which the data is in the Invalid state, the cache controller of the cache apparatus switches the Writable Modified state of the data to the Shared Modified state (there is no transfer of the owner). In the case where the data in the Writable Modified state is obtained from the other apparatus in response to the fetching request for the data in the Invalid state from the CPU of the self apparatus, the cache controller of the cache apparatus switches the Invalid state of the obtained data to the Shared state. In this case, if the fetching request is received for the data in the Shared Modified state from the CPU of the self apparatus, the cache controller of the cache apparatus maintains the Shared Modified state and does not transfer the owner.
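The first fetching process can be summarized as a pair of transitions, one on the supplying side and one on the requesting side. The sketch below is illustrative: the one-letter codes ("W" for Writable Modified, "O" for Shared Modified), the function names, and the tuple return value are assumptions.

```python
def first_process_fetch(supplier_state, requester_state):
    """Fetch from another apparatus under the first process.

    Returns (new supplier state, new requester state).
    """
    if supplier_state != "W" or requester_state != "I":
        raise ValueError("sketch covers only a fetch of W data by an Invalid requester")
    # The supplier keeps ownership: W -> O. The requester becomes a sharer: I -> S.
    return "O", "S"


def first_process_self_fetch(state):
    """Fetch by the self CPU to data already held in Shared Modified."""
    if state != "O":
        raise ValueError("sketch covers only the Shared Modified case")
    return "O"  # state is maintained; the owner does not move
```

Under this process the apparatus that held the data in the Writable Modified state remains the owner after the data becomes shared.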
(Second Self Apparatus Fetching Process)
As a second self apparatus fetching process, in the case where a fetching request is received for the data in the Writable Modified state from the other apparatus in which the data is in the Invalid state, the cache controller of the cache apparatus switches the Writable Modified state of the data to the Shared state. In the case where the data in the Writable Modified state is obtained from the other apparatus in response to the fetching request for the data in the Invalid state from the CPU of the self apparatus, the cache controller of the cache apparatus switches the Invalid state of the obtained data to the Shared Modified state (transfer of the owner). In this case, if the fetching request is received for the data in the Shared Modified state from the CPU of the self apparatus, the cache controller of the cache apparatus maintains the Shared Modified state. In the case where the fetching request is received for the data in the Shared Modified state from the other apparatus, the cache controller switches the Shared Modified state of the data to the Shared state (transfer of the owner). Further, in the case where the data in the Shared Modified state is obtained from the other apparatus in response to the fetching request from the CPU of the self apparatus for the data in the Invalid state, the cache controller switches the Invalid state of the obtained data to the Shared Modified state.
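In the second fetching process the ownership moves to the most recent fetcher. The sketch below models this for one data block. The function name `second_process_fetch`, the one-letter codes ("W", "O"), and the dictionary layout are assumptions made for illustration.

```python
def second_process_fetch(states, requester):
    """Fetch under the second process; the requester becomes the new owner."""
    if states[requester] != "I":
        return  # hit: data in Shared Modified stays in Shared Modified
    owner = next((c for c, st in states.items() if st in ("W", "O")), None)
    if owner is None:
        raise ValueError("sketch covers only fetches served by a W or O holder")
    states[owner] = "S"       # W -> S or O -> S on the supplying side
    states[requester] = "O"   # I -> O on the requesting side (owner transferred)


states = {"102-1": "W", "102-2": "I", "102-3": "I"}
second_process_fetch(states, "102-2")   # 102-1: W -> S, 102-2: I -> O
second_process_fetch(states, "102-3")   # 102-2: O -> S, 102-3: I -> O
```

After the two fetches, the last requester holds the data in the Shared Modified state and the earlier holders are in the Shared state.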
As a storing process of the Shared Modified state corresponding to the first and second self apparatus fetching processes, in the case where a storing request is received for the data in the Shared Modified state from the CPU of the self apparatus, the cache controller of the cache apparatus stores the data and, thereafter, switches the Shared Modified state of the data to the Modified state. In the case where a state changing request based on the storing request from the other apparatus is received for the data in the Shared Modified state, the cache controller switches the Shared Modified state of the data to the Invalid state.
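The two store-related transitions out of the Shared Modified state can be written as a small lookup table. The event names `"self_store"` and `"state_change_request"` and the helper `next_state` are hypothetical labels chosen for this sketch.

```python
# (event, current state) -> next state, for data in Shared Modified ("O")
STORE_TRANSITIONS = {
    ("self_store", "O"): "M",             # store by the self CPU
    ("state_change_request", "O"): "I",   # store performed by the other apparatus
}


def next_state(event, state):
    """Look up the next state; raises KeyError for cases the sketch does not cover."""
    return STORE_TRANSITIONS[(event, state)]
```

A table of this form is only one possible representation; the description itself does not say how the cache controller stores its transition rules.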
The invention provides a method of controlling a cache memory. The control method comprises the operations of: storing, in the cache memory, data from a storage device and information indicative of a status of the data; and controlling the cache memory by using a Writable Modified state for allowing sharing performance to be possessed step by step in a case where there is a fetching request.
For example, in a fetching protocol, the state of data is expressed by six states including the Writable Modified state for allowing the sharing performance to be possessed step by step in a case where there is a fetching request, in addition to the Invalid state, Shared state, Exclusive state, Modified state, and Shared Modified state, thereby controlling the cache memory.
As a fetching process for the Shared Modified state formed by the first fetching process, in the case where the fetching request is received for the data in the Shared Modified state from the CPU of the self apparatus or the other apparatus, the Shared Modified state is maintained. As a second fetching process after the Writable Modified state is formed, in the case where a certain cache apparatus receives the fetching request for the data in the Writable Modified state from the other apparatus in which the data is in the Invalid state, the cache apparatus switches the Writable Modified state of the data to the Shared state. The other apparatus switches the Invalid state of the obtained data to the Shared Modified state. As a fetching process for the Shared Modified state formed by the second fetching process, in the case where the fetching request is received for the data in the Shared Modified state from the CPU of the self apparatus, the Shared Modified state is maintained. In the case where the fetching request is received for the data in the Shared Modified state from the other apparatus, the Shared Modified state of the data is switched to the Shared state. As a storing process corresponding to the first and second fetching processes, in the case where the storing request is received for the data in the Shared Modified state from the CPU of the self apparatus, the data is stored and, thereafter, the Shared Modified state of the data is switched to the Modified state. In the case where the state change notification based on the storing request is received for the data in the Shared Modified state from the other apparatus, the Shared Modified state of the data is switched to the Invalid state.
Further, according to the invention, there is provided a multiprocessor system comprising: a storage device; a plurality of processing modules connected to the storage device, each processing module having: a cache memory for storing data from the storage device and information indicative of a status of the data; and a cache controller for controlling the cache memory in a Writable Modified state for allowing sharing performance to be possessed step by step in a case where there is a fetching request.