Inter process communication (IPC) is a conventionally known technology for transmitting and receiving data between multiple pieces of software when the pieces of software perform a process in cooperation with one another. One known example of inter process communication is a technology that performs the communication by using a queue.
In the following, an example of a technology for performing inter process communication using a queue will be described with reference to FIGS. 40 to 43. FIG. 40 is a schematic diagram illustrating the concept of inter process communication performed by conventional software. In the example illustrated in FIG. 40, each of user processes A to C stores data, which is to be transmitted to a user process D, in a First In First Out (FIFO) queue that is implemented by software. Furthermore, the user process D obtains the data stored in the FIFO queue in the order in which the data arrived.
In the following, a process performed when the user processes A to C transmit messages A to C, respectively, to the user process D by using a queue will be described with reference to FIG. 41. FIG. 41 is a schematic diagram illustrating a process performed as inter process communication using the conventional software. In the example illustrated in FIG. 41, a memory stores therein a base address of a storage area that is used as a queue, a read pointer that indicates a read address of the data, and a write pointer that indicates a write address of the data. In the example described below, it is assumed that each of the initial values of the read pointer and the write pointer is “0x0120”.
For example, the user process A refers to the write pointer and stores a 32-byte message A from the address “0x0120” indicated by the write pointer. Then, the user process A adds “0x0020” to “0x0120”, which is the value of the write pointer, and updates the write pointer to “0x0140”. When updating the write pointer, the user process A either issues an atomic instruction, such as a compare-and-swap (CAS) instruction or a fetch-and-add (FAD) instruction, or performs exclusive access control in which, for example, an exclusive lock of the write pointer is obtained.
Subsequently, similarly to the user process A, the user process B refers to the write pointer; stores the 32-byte message B from the address “0x0140” indicated by the write pointer; and updates the value of the write pointer from “0x0140” to “0x0160”. Furthermore, similarly, the user process C stores the message C in the address “0x0160” indicated by the write pointer and updates the value of the write pointer from “0x0160” to “0x0180”.
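The write-side steps described above can be sketched as follows. This is only a minimal illustrative model in Python, not the conventional software itself. The class name `SoftwareQueue`, the dictionary standing in for the storage area, and the use of `threading.Lock` as the exclusive lock of the write pointer are all assumptions made for illustration. Holding the lock across both the store and the pointer update is a further simplification.

```python
import threading

MESSAGE_SIZE = 0x0020  # each message is 32 bytes, as in the example above


class SoftwareQueue:
    """Hypothetical model of the queue state held in memory."""

    def __init__(self, initial_pointer=0x0120):
        self.read_pointer = initial_pointer
        self.write_pointer = initial_pointer
        self.storage = {}              # address -> message (stand-in for the storage area)
        self._lock = threading.Lock()  # stand-in for the exclusive lock of the write pointer

    def send(self, message):
        """Store a message at the write pointer, then advance the pointer."""
        with self._lock:
            address = self.write_pointer
            self.storage[address] = message
            self.write_pointer = address + MESSAGE_SIZE
        return address


queue = SoftwareQueue()
addresses = [queue.send(m) for m in ("message A", "message B", "message C")]
print([hex(a) for a in addresses])  # ['0x120', '0x140', '0x160']
print(hex(queue.write_pointer))     # 0x180
```

In this sketch, the three calls correspond to the user processes A to C, and the final value of the write pointer corresponds to “0x0180”.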
In contrast, the user process D determines, asynchronously to the user processes A to C, whether the value of the read pointer matches the value of the write pointer. If the two values do not match, the user process D determines that a new message is stored in the queue and reads the message from the address indicated by the read pointer.
For example, because the value of the read pointer is “0x0120” and the value of the write pointer is “0x0160”, the user process D reads the message A from “0x0120” and updates the value of the read pointer from “0x0120” to “0x0140”. By repeating this process until the values of the read pointer and the write pointer match, the user process D reads each of the messages A to C stored in the queue.
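The read-side loop can be sketched in the same illustrative manner. The function name `receive_all` and the dictionary used as the storage area are hypothetical, and the sketch assumes that the write pointer has already reached “0x0180” because all three messages have been stored.

```python
MESSAGE_SIZE = 0x0020  # 32-byte messages, as in the example above


def receive_all(storage, read_pointer, write_pointer):
    """Read messages until the read pointer catches up with the write pointer.

    Returns the messages in arrival order and the final read pointer.
    """
    received = []
    while read_pointer != write_pointer:  # a mismatch means a new message is queued
        received.append(storage[read_pointer])
        read_pointer += MESSAGE_SIZE      # advance past the message just read
    return received, read_pointer


# Hypothetical queue contents after the user processes A to C have each sent one message.
storage = {0x0120: "message A", 0x0140: "message B", 0x0160: "message C"}
messages, read_pointer = receive_all(storage, read_pointer=0x0120, write_pointer=0x0180)
print(messages)           # ['message A', 'message B', 'message C']
print(hex(read_pointer))  # 0x180
```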
Furthermore, a technology for a multi node system in which multiple central processing units (CPUs) perform different processes is known. One known example of such a multi node system is an information processing system that includes multiple CPUs, each of which caches data and simultaneously performs a different process. A technology for a shared memory system is also known in which each CPU executes an independent OS and a part of the memory area is shared between the CPUs. With this configuration, performance can be improved. Furthermore, because an OS operates individually in each node, an error in one node can be prevented from affecting the OSs of the other nodes, which makes it possible to improve the availability of the system.
FIG. 42 is a schematic diagram illustrating the concept of a multi node system using a shared memory. As illustrated in FIG. 42, the information processing system includes multiple nodes #0 to #3 that include CPUs #0 to #3, respectively. Each of the nodes #0 to #3 includes a local memory, hypervisor (HPV) software, an operating system (OS), and a device driver and simultaneously performs different user processes A to D, respectively. The HPV software is software that manages a virtual machine operated by each of the nodes #0 to #3. The information processing system described above implements a queue by storing the write pointer and the read pointer in a shared memory that is shared by each of the nodes #0 to #3 and performs the inter process communication between the user processes A to D.
In the following, an example of a process performed by each of the CPUs #0 to #3 when the user processes A to C transmit messages A to C, respectively, to the user process D will be described with reference to FIG. 43. FIG. 43 is a schematic diagram illustrating a process in which a write pointer is cached by each node. For example, the CPU #0 that executes the user process A caches the write pointer stored in the shared memory and stores the message A from the address “0x0120” indicated by the write pointer (in (1) of FIG. 43). Furthermore, the CPU #0 updates the value of the cached write pointer to “0x0140” (in (2) of FIG. 43) and stores information indicating that the cache line of the write pointer is in an updated state (i.e., the Modified state).
Subsequently, because the cache line of the write pointer is in the updated state, a CPU #1 that executes the user process B caches, from the CPU #0, the write pointer that is updated by the CPU #0 (in (3) of FIG. 43). Then, after storing the message B from the address “0x0140” indicated by the cached write pointer, the CPU #1 updates the value of the write pointer to “0x0160” (in (4) of FIG. 43). Similarly, a CPU #2 that executes the user process C caches, from the CPU #1, the write pointer that is updated by the CPU #1 (in (5) of FIG. 43) and stores the message C from the address “0x0160” indicated by the cached write pointer. Then, the CPU #2 updates the value of the write pointer to “0x0180” (in (6) of FIG. 43).
At this point, a CPU #3 that executes the user process D caches the read pointer from the shared memory (in (7) of FIG. 43). Furthermore, because the cache line of the write pointer is in the updated state, the CPU #3 caches, from the CPU #2, the write pointer that is updated by the CPU #2 (in (8) of FIG. 43). Because the value of the read pointer “0x0120” does not match the value of the write pointer “0x0180”, the CPU #3 reads the message from the address indicated by the read pointer and updates the value of the read pointer. Thereafter, the CPU #3 repeatedly reads a message and updates the read pointer until the values of the read pointer and the write pointer match, thereby obtaining the messages A to C transmitted by the user processes A to C, respectively.
Patent Document 1: Japanese Patent No. 2703417
Patent Document 2: Japanese Laid-open Patent Publication No. 2003-216592
Patent Document 3: Japanese Laid-open Patent Publication No. 07-200506
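The hand-off of the write pointer between caches described with reference to FIG. 43 can be sketched as follows. This is a deliberately simplified model and not an actual cache coherence protocol. The class `SharedLine`, its methods, and the trace format are hypothetical names introduced for illustration. The sketch assumes that a line in the updated state is supplied by the cache of the CPU that last updated it and is not written back to the shared memory.

```python
class SharedLine:
    """Hypothetical model of one shared-memory line holding the write pointer."""

    def __init__(self, value):
        self.memory_value = value  # copy held in the shared memory
        self.owner = None          # CPU whose cache holds the line in the updated state
        self.cached_value = None   # latest value, held only in the owner's cache

    def fetch(self):
        """Return (value, source) for a CPU that caches the line."""
        if self.owner is None:
            return self.memory_value, "shared memory"
        return self.cached_value, f"CPU #{self.owner}"

    def update(self, cpu, value):
        """Update the copy cached on `cpu` and mark the line as updated there."""
        self.owner = cpu
        self.cached_value = value


line = SharedLine(0x0120)
trace = []
for cpu in (0, 1, 2):                 # CPUs executing the user processes A to C
    value, source = line.fetch()
    trace.append((cpu, hex(value), source))
    line.update(cpu, value + 0x0020)  # store a 32-byte message, then advance

value, source = line.fetch()          # CPU #3 executes the user process D
trace.append((3, hex(value), source))
for entry in trace:
    print(entry)
```

In this model, only the CPU #0 obtains the write pointer from the shared memory. Every later CPU obtains it from the cache of the CPU that updated it last, while the copy in the shared memory keeps its initial value.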
However, in the technology for storing a read pointer and a write pointer in a shared memory, a node at the data transmission side caches the write pointer and updates the value of the cached write pointer. Accordingly, if a failure occurs in the node that caches the write pointer, the other nodes cannot obtain the latest value of the write pointer and thus cannot perform data transmission. Consequently, there is a problem in that the failure is propagated to the other nodes.
In the following, a description will be given of a problem, with reference to FIG. 44, in which a failure is propagated to another node when the failure has occurred in a node in which a write pointer is being cached. FIG. 44 is a schematic diagram illustrating the flow in which a failure is propagated to another node when the failure has occurred in a node.
For example, the CPU #0 caches the write pointer stored in the shared memory, stores the message A, and updates the value of the cached write pointer to “0x0140”. Subsequently, the CPU #1 caches, from the CPU #0, the write pointer that is updated by the CPU #0, stores the message B from the address “0x0140” indicated by the cached write pointer, and updates the value of the write pointer to “0x0160”.
In this state, it is assumed that an error has occurred in the node #1 and that the CPU #1 has stopped abnormally. In order to transmit a message, the CPU #2 attempts to cache the write pointer that has been updated by the CPU #1; however, because the CPU #1 has stopped, the CPU #2 cannot cache the write pointer, as illustrated in (A) of FIG. 44. Accordingly, the CPU #2 is unable to continue the process and stops abnormally.
Furthermore, the CPU #3 attempts to cache the write pointer in order to determine whether a new message has been transmitted. However, because the CPU #1 has stopped, the CPU #3 cannot cache the write pointer from the CPU #1 and thus stops abnormally, as illustrated in (B) of FIG. 44. Furthermore, if the CPU #3 stops abnormally, the process specified by the message A is not performed, and therefore the CPU #0 may also stop abnormally due to, for example, a time-out.
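The propagation of the failure can be illustrated with a small simulation. As before, this is only a sketch under stated assumptions: the class `SharedLine`, the exception `NodeDownError`, and the set `alive` are hypothetical, and the model assumes that the latest value of the write pointer exists only in the cache of the CPU that updated it last.

```python
class NodeDownError(Exception):
    """Raised when the CPU holding the latest copy of a line has stopped."""


class SharedLine:
    """Hypothetical model of the shared-memory line holding the write pointer."""

    def __init__(self, value):
        self.memory_value = value  # copy held in the shared memory
        self.owner = None          # CPU whose cache holds the updated line
        self.cached_value = None   # latest value, held only in the owner's cache

    def fetch(self, alive):
        """Return the latest value, or fail if the owning CPU has stopped."""
        if self.owner is None:
            return self.memory_value
        if self.owner not in alive:
            raise NodeDownError(f"CPU #{self.owner} holds the write pointer but has stopped")
        return self.cached_value

    def update(self, cpu, value):
        self.owner = cpu
        self.cached_value = value


alive = {0, 1, 2, 3}
line = SharedLine(0x0120)
for cpu in (0, 1):   # the CPU #0 and the CPU #1 each store a message
    line.update(cpu, line.fetch(alive) + 0x0020)

alive.discard(1)     # the node #1 fails while its cache holds the write pointer

stopped = []
for cpu in (2, 3):   # the CPU #2 (sender) and the CPU #3 (receiver)
    try:
        line.fetch(alive)
    except NodeDownError:
        stopped.append(cpu)
print(stopped)       # [2, 3]
```

Under these assumptions, neither the CPU #2 nor the CPU #3 can obtain the write pointer once the CPU #1 has stopped, which corresponds to (A) and (B) of FIG. 44.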