Data mirroring is widely used to prevent data loss. As one example, in some systems equipped with a plurality of information processing apparatuses, the content of a cache memory of one information processing apparatus is mirrored in a cache memory of another information processing apparatus.
One example of a mirroring technology is a proposed storage system in which each of a plurality of storage control apparatuses is equipped with a local cache and a mirror cache. The cache memories are duplicated cyclically: the local cache of each storage control apparatus is duplicated in the mirror cache of an adjacent storage control apparatus.
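The cyclic duplication described above can be illustrated with a minimal sketch. It is not taken from the cited documents: the `Controller` class, the `write` function, the controller names, and the choice of "next index in the ring" as the adjacent apparatus are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical storage control apparatus with a local and a mirror cache."""
    name: str
    local_cache: dict = field(default_factory=dict)
    mirror_cache: dict = field(default_factory=dict)


def write(controllers, index, key, value):
    """Store data in one controller's local cache and duplicate it in the
    mirror cache of the adjacent controller (assumed here: the next one in the ring)."""
    owner = controllers[index]
    neighbor = controllers[(index + 1) % len(controllers)]
    owner.local_cache[key] = value
    neighbor.mirror_cache[key] = value


# Three controllers arranged in a ring: cm0 -> cm1 -> cm2 -> cm0.
controllers = [Controller(f"cm{i}") for i in range(3)]
write(controllers, 2, "block-7", b"data")
```

In this sketch the last controller wraps around to the first, so every local cache has exactly one mirror and the loss of any single controller leaves a copy of its cache data on a neighbor.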
Another proposed technology relating to data protection stops any running processing when a power failure is detected at a storage apparatus, so as to protect cache data that would otherwise be lost when power is lost. As one example, cache data may be held for a certain period following a power failure using power supplied from a battery. When the power failure continues beyond that period, the cache data can be protected by writing it to a nonvolatile memory.
See, for example, the following documents:
International Publication Pamphlet No. WO2004-114115; and
Japanese Laid-Open Patent Publication No. 2014-215661.
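The battery-backed protection of cache data described above can be sketched as follows. This is an illustrative assumption rather than the method of the cited documents: the function name `handle_power_failure`, its parameters, and the returned labels are hypothetical, and the dictionaries merely stand in for the cache memory and the nonvolatile memory.

```python
def handle_power_failure(cache, nonvolatile, outage_seconds, hold_seconds):
    """Decide how cache data is protected for an outage of the given length.

    cache, nonvolatile: dicts standing in for the two memories (assumption).
    hold_seconds: period during which the battery holds the cache data.
    """
    if outage_seconds <= hold_seconds:
        # Power returned within the hold period: the cache data stays in place.
        return "held_in_cache"
    # The outage continued beyond the hold period: write the data out.
    nonvolatile.update(cache)
    return "saved_to_nonvolatile"


nonvolatile = {}
short_outage = handle_power_failure({"block-7": b"data"}, nonvolatile, 5, 60)
long_outage = handle_power_failure({"block-7": b"data"}, nonvolatile, 120, 60)
```

The sketch covers only a loss of power; it does not model an apparatus that goes down for another reason.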
However, an information processing apparatus can go down not only due to a power failure but also due to a software error in which inconsistent data is input to a CPU (Central Processing Unit) that controls the operations of the information processing apparatus. When an apparatus goes down due to a software error, it may be possible to recover the information processing apparatus using a method called “machine recovery”, in which the information processing apparatus is restarted while the content of its memory is retained. In particular, when two information processing apparatuses that each store cache data and mirror data for the other's cache data both go down due to a software error, having both information processing apparatuses execute a machine recovery makes it possible to quickly recover the system with the cache data still in the duplicated state.
There are also systems with a plurality of information processing apparatuses in which the respective information processing apparatuses monitor each other's state and can detect when another information processing apparatus has gone down. However, it is difficult for the detecting apparatus to identify why the other information processing apparatus went down. When an information processing apparatus that went down due to a power failure is instructed to execute a machine recovery as described above, the machine recovery will not be executed. In this situation, the cache data and mirror data that were stored in the memory of the information processing apparatus that went down are lost, which can reduce the reliability of the operations of that information processing apparatus.