Conventionally, Non-Uniform Memory Access (NUMA) technology is known in which a plurality of pairs, each having a memory as a main storage unit and a Central Processing Unit (CPU) as a processor for managing data on the memory, are provided and in which the CPUs share each of the memories. As an example of such NUMA technology, Cache Coherent Non-Uniform Memory Access (ccNUMA) technology is known in which each of the CPUs uses a directory to maintain coherency between data on the memory connected to that CPU and data held by each of the CPUs.
When a CPU to which the ccNUMA technology is applied receives a data transfer request from another CPU for data on the memory that it manages, and yet another CPU holds that data, the CPU may cause the CPU that holds the data to transfer it. A data transfer process performed by the CPU to which the ccNUMA technology is applied will be explained below with reference to FIG. 29.
FIG. 29 is a diagram for explaining a request transfer process performed by a conventional CPU. In the description below, a CPU that manages coherency of transfer target data is called a Home (H)-CPU, and a CPU that issues a data transfer request is called a Local (L)-CPU. A CPU that already holds the transfer target data from the memory managed by the H-CPU is called a Remote (R)-CPU.
First of all, as illustrated in (A) of FIG. 29, the L-CPU issues a data transfer request to the H-CPU. Then, the H-CPU checks the directory state of the memory address where the transfer target data is stored. In this example, the H-CPU determines that the R-CPU holds the latest data and issues a data transfer request to the R-CPU as illustrated in (B) of FIG. 29.
Meanwhile, when receiving the data transfer request from the H-CPU, as illustrated in (C) of FIG. 29, the R-CPU transmits the transfer target data to the L-CPU. As illustrated in (D) of FIG. 29, the R-CPU also transmits a data transfer response indicating a current cache state of the transfer target data to the H-CPU.
When receiving the data transfer response from the R-CPU, the H-CPU determines, based on the current cache state indicated by the data transfer response, the directory state and the cache state to be used once the L-CPU holds the data. As illustrated in (E) of FIG. 29, the H-CPU transmits to the L-CPU a request response indicating acquisition of data ownership and a new cache state. Thereafter, when receiving the request response from the H-CPU, the L-CPU performs processing using the data received from the R-CPU according to the new cache state indicated by the request response.
Patent Literature 1: Japanese Laid-open Patent Publication No. 2010-198490
Non-patent Literature 1: John L. Hennessy, David A. Patterson, “Computer Architecture: A Quantitative Approach” 4th Edition, pp. 230-237
However, in the above technology, in which three CPUs are involved in the data transfer, the H-CPU transmits the request response indicating the acquisition of data ownership to the L-CPU only after receiving the data transfer response from the R-CPU. Consequently, even after the L-CPU has received the data from the R-CPU, it must wait for the request response before using the data, so completing the data transfer takes time and data transfer performance is degraded.
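The message sequence (A) through (E) of FIG. 29 can be sketched as a small dependency model to show where the delay arises. The following Python sketch is illustrative only and is not part of the conventional technology itself: the names `Message`, `CONVENTIONAL_FLOW`, `arrival_times`, and `completion_time` are hypothetical, and the model assumes that every inter-CPU message takes one uniform time unit and that processing inside each CPU takes no time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    step: str                # label used in FIG. 29, (A) through (E)
    src: str                 # sending CPU
    dst: str                 # receiving CPU
    kind: str                # what the message carries
    depends_on: tuple = ()   # steps that must arrive before this one is sent


# Conventional three-CPU flow as described with reference to FIG. 29.
CONVENTIONAL_FLOW = [
    Message("A", "L-CPU", "H-CPU", "data transfer request"),
    Message("B", "H-CPU", "R-CPU", "data transfer request", ("A",)),
    Message("C", "R-CPU", "L-CPU", "transfer target data", ("B",)),
    Message("D", "R-CPU", "H-CPU", "data transfer response", ("B",)),
    Message("E", "H-CPU", "L-CPU", "request response", ("D",)),
]


def arrival_times(flow, hop_latency=1):
    """Return the arrival time of each step.

    Assumption: every message takes hop_latency time units and the
    flow is listed in dependency order.
    """
    arrived = {}
    for msg in flow:
        # A message is sent once every step it depends on has arrived.
        sent = max((arrived[d] for d in msg.depends_on), default=0)
        arrived[msg.step] = sent + hop_latency
    return arrived


def completion_time(flow, requester="L-CPU", hop_latency=1):
    """Time at which the requester has received every message sent to it."""
    times = arrival_times(flow, hop_latency)
    return max(times[m.step] for m in flow if m.dst == requester)


times = arrival_times(CONVENTIONAL_FLOW)
done = completion_time(CONVENTIONAL_FLOW)
```

Under these assumptions, the data of (C) reaches the L-CPU after three message hops, but the request response of (E) arrives only after four, because (E) cannot be sent until (D) has reached the H-CPU. The L-CPU therefore holds the data for one additional hop before it can use it, which corresponds to the delay described above.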