In recent years, information processing systems in which plural information processing apparatuses are connected via a crossbar switch or the like have come into use. Each information processing apparatus has plural central processing units (CPUs), a memory, a hard disk drive (HDD), and the like, and communicates with the other information processing apparatuses via the crossbar switch or the like. The memory of each information processing apparatus includes a local memory that only that information processing apparatus is able to access, and a shared memory that the other information processing apparatuses are also able to access.
For the shared memory, a technique using access tokens has been developed to control permission for access from the other information processing apparatuses. Each information processing apparatus stores, in a register, a key called a memory token for each unit area of a predetermined size in the shared memory, and permits access to a unit area only by an information processing apparatus that has specified the corresponding key as its access token. If a failure occurs in one of the other information processing apparatuses using the shared memory, the information processing apparatus having the shared memory stores a new memory token in the register, and then transmits the new memory token to the information processing apparatus in which the failure has occurred. However, because the information processing apparatus in which the failure has occurred is unable to receive the new memory token, the tokens do not match even if that information processing apparatus attempts to access the shared memory. Access to the shared memory from the information processing apparatus in which the failure has occurred is thereby prevented.
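The token comparison described above can be illustrated with a minimal sketch. All class and method names below (SharedMemoryHome, RemoteNode, rotate_token, and so on) are hypothetical and do not appear in the cited literature. The sketch further assumes that the new memory token is sent to every registered apparatus and that only an apparatus in which a failure has occurred fails to receive it.

```python
import secrets


class SharedMemoryHome:
    """Apparatus that has the shared memory (hypothetical model)."""

    def __init__(self, num_segments):
        # One memory token per unit area, held in a simulated register.
        self.token_register = [secrets.randbits(32) for _ in range(num_segments)]
        self.data = [0] * num_segments

    def read(self, segment, access_token):
        # Access is permitted only when the access token matches the memory token.
        if access_token != self.token_register[segment]:
            raise PermissionError("access token does not match memory token")
        return self.data[segment]

    def rotate_token(self, segment, remote_nodes):
        # Store a new memory token that differs from the old one.
        old_token = self.token_register[segment]
        new_token = secrets.randbits(32)
        while new_token == old_token:
            new_token = secrets.randbits(32)
        self.token_register[segment] = new_token
        # Assumption: the new token is transmitted to every registered apparatus.
        for node in remote_nodes:
            node.receive_token(segment, new_token)


class RemoteNode:
    """Apparatus that accesses the shared memory of another apparatus."""

    def __init__(self, name):
        self.name = name
        self.failed = False
        self.access_tokens = {}

    def receive_token(self, segment, token):
        # An apparatus in which a failure has occurred cannot receive the token,
        # so the copy it holds becomes stale.
        if not self.failed:
            self.access_tokens[segment] = token

    def read(self, home, segment):
        return home.read(segment, self.access_tokens[segment])
```

In this sketch, after rotate_token is called, a read by an apparatus whose failed flag is set raises PermissionError because its stale access token no longer matches the register, while a read by a normal apparatus still succeeds.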
Further, there is a technique for preventing a malfunction due to memory destruction when running a parallel processing program, by referring to a page table that includes identifiers of the processor elements permitted to access a physical memory, and determining whether or not the physical memory is accessible by a given processor element.
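As a minimal sketch of such a page table lookup, the following assumes a dictionary keyed by virtual page number whose entries hold a set of permitted processor element identifiers. The names PageTableEntry, allowed_pes, and is_accessible are hypothetical and are not taken from the cited literature.

```python
from dataclasses import dataclass, field


@dataclass
class PageTableEntry:
    physical_page: int
    # Identifiers of the processor elements permitted to access this page.
    allowed_pes: frozenset = field(default_factory=frozenset)


def is_accessible(page_table, virtual_page, pe_id):
    """Return True only if the page is mapped and pe_id is permitted."""
    entry = page_table.get(virtual_page)
    return entry is not None and pe_id in entry.allowed_pes
```

For example, with a table that maps virtual page 0 to an entry permitting processor elements 0 and 1, is_accessible returns True for processor element 1 and False for processor element 2.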
Furthermore, there is a technique for realizing access protection of a shared memory by determining, when a processor accesses the shared memory, whether the access is a violation, based on bit patterns indicating whether read and write are each permitted or prohibited.
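A determination of this kind can be sketched as follows, assuming a two-bit pattern in which one bit indicates read permission and the other indicates write permission. The bit assignment and the names READ, WRITE, and is_violation are assumptions made for illustration only.

```python
READ = 0b01   # assumed bit position for read permission
WRITE = 0b10  # assumed bit position for write permission


def is_violation(permission_bits, requested):
    """Return True if the request needs a bit that the pattern does not grant."""
    return (requested & ~permission_bits) != 0
```

Under these assumptions, a write request against a read-only pattern is determined to be a violation, whereas a read request against a pattern permitting both read and write is not.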
[Patent Literature 1] Japanese Laid-open Patent Publication No. 2013-140446
[Patent Literature 2] Japanese Laid-open Patent Publication No. 2011-70528
[Patent Literature 3] Japanese Laid-open Patent Publication No. S59-121561
However, when a failure occurs in an information processing apparatus that uses a shared memory, access to the whole shared memory is temporarily stopped in order to reset the access token. Consequently, there is a problem in that even when the normal information processing apparatuses, other than the one in which the failure has occurred, attempt to access the shared memory, their access is interrupted by the process of stopping and restarting access to the whole shared memory.