Conventionally, apparatuses such as a watchdog timer and a diagnosis apparatus monitor devices that are internally and/or externally connected to an information processing apparatus. Techniques also exist that use these apparatuses to periodically check the devices and detect abnormalities of the devices. According to one technique, a backup system is managed separately from a main system, and operation is switched from the main system to the backup system when an abnormality of a device is detected in the main system.
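As a rough illustration of the periodic check and switchover described above, the following Python sketch models a software watchdog. The class `Watchdog`, the function `select_active_system`, the device names, and the timeout value are all hypothetical; they are assumptions made for illustration and are not taken from any of the cited publications.

```python
from typing import Dict, List


class Watchdog:
    """Hypothetical watchdog: a device is abnormal if it misses its deadline."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        # Device name -> time of the most recent sign of life (assumed model).
        self.last_kick: Dict[str, float] = {}

    def kick(self, device: str, now: float) -> None:
        """Record that the device responded at time `now`."""
        self.last_kick[device] = now

    def abnormal_devices(self, now: float) -> List[str]:
        """Return the devices that have not responded within the timeout."""
        return sorted(
            device
            for device, last in self.last_kick.items()
            if now - last > self.timeout
        )


def select_active_system(watchdog: Watchdog, now: float, active: str = "main") -> str:
    """Switch from the main system to the backup system on any abnormality."""
    if active == "main" and watchdog.abnormal_devices(now):
        return "backup"
    return active


if __name__ == "__main__":
    wd = Watchdog(timeout=1.0)
    wd.kick("disk", 0.0)
    wd.kick("nic", 0.0)
    wd.kick("disk", 2.0)  # only the disk keeps responding
    print(wd.abnormal_devices(2.5))
    print(select_active_system(wd, 2.5))
```

Time is passed in explicitly rather than read from a clock so that the check is deterministic. Note that a check of this kind only observes whether a device responds; it says nothing about which software is holding the device.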
A technique for handling the occurrence of an abnormality is disclosed. According to the technique, an additional apparatus is provided for an input and output apparatus of a multi-core processor system that includes plural cores, whereby the number of the central processing unit (CPU) that has been started up is recorded (see, for example, Japanese Laid-Open Patent Publication No. S55-108026). A technique is also disclosed for restoration executed after the occurrence of an abnormality. According to the technique, in controlling a co-processor that aids a CPU, a hang-up is detected and the co-processor that has hung up is reset (see, for example, Published Japanese-Translation of PCT Application, Publication No. 2007-507034).
A technique for reducing the response time between devices is also disclosed. According to the technique, a control apparatus is disposed in a system shared by plural CPUs and plural inputs/outputs (I/Os); when a start-up I/O is operating, a control signal is recorded; and after the completion of the operation, a dummy I/O is transmitted (see, for example, Japanese Laid-Open Patent Publication No. H6-208536).
Of the above conventional techniques, the techniques according to Japanese Laid-Open Patent Publication No. S55-108026 and Published Japanese-Translation of PCT Application, Publication No. 2007-507034 are each directed to abnormalities caused by hardware faults. According to these techniques, however, faults caused by software access to devices are determined to be normal states, and therefore a problem arises in that abnormal states are overlooked.
In a multi-core processor system, if an abnormal state is overlooked, a state occurs in which application software (hereinafter referred to as “app”) having a problem stalls while retaining the access right to a shared device, and another app cannot acquire the access right to the shared device. As a result, a problem arises in that the other app, unable to acquire the access right, also stalls; that is, consecutive stalls occur. When consecutive stalls occur, another problem arises in that, among the apps that have stalled, the app that is the software having the problem has to be identified.
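To make the identification problem concrete, the following Python sketch models consecutive stalls as a chain of apps waiting for access rights held by other apps. It is only an illustration of the problem under assumed data structures, not a method disclosed in the cited publications; the function `find_stall_origins`, the mappings `holders` and `waiting`, and the app and device names are all hypothetical.

```python
from typing import Dict, Set


def find_stall_origins(holders: Dict[str, str], waiting: Dict[str, str]) -> Set[str]:
    """Return apps that block others but are not themselves waiting for a device.

    holders: device name -> app currently retaining its access right (assumed model)
    waiting: stalled app -> device whose access right it is waiting for
    """
    origins: Set[str] = set()
    for app in waiting:
        seen: Set[str] = set()
        current = app
        # Follow the chain: app -> device it waits for -> app holding that device.
        while current in waiting and current not in seen:
            seen.add(current)
            holder = holders.get(waiting[current])
            if holder is None:
                break  # the device is not held by anyone; the chain ends here
            current = holder
        # The chain ended at an app that holds a device yet waits for nothing.
        if current not in waiting:
            origins.add(current)
    return origins


if __name__ == "__main__":
    # app_a stalls while holding "dma"; app_b waits for "dma" while holding
    # "display"; app_c waits for "display". All three appear stalled.
    holders = {"dma": "app_a", "display": "app_b"}
    waiting = {"app_b": "dma", "app_c": "display"}
    print(find_stall_origins(holders, waiting))
```

In this model all three apps are stalled, yet only `app_a` is the origin: `app_b` and `app_c` stall merely because they cannot acquire an access right. Circular waits are deliberately not reported as origins, since no single app in a cycle is waiting for nothing.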