In the operation of an information processing system, a system administrator may determine whether a failure has occurred in apparatuses such as servers, storage devices, and communication apparatuses, and take necessary measures when a failure is found. For example, if a hardware failure is found in an apparatus, the system administrator may stop the apparatus and replace the hardware. In addition, if a failure is found in the execution state of software, the system administrator may stop processes of the software and investigate the cause of the failure. Further, if an overload on an apparatus is found, the system administrator may add more resources for information processing.
When the number of apparatuses in the information processing system becomes large, however, the burden of the monitoring operation on the system administrator increases. One conceivable way to reduce the burden is for an information processing apparatus for operations management to collect information from monitored target apparatuses and examine the collected information, thereby automatically detecting a failure (or a sign of a failure) in an apparatus. When detecting a failure, the information processing apparatus may issue a warning to the system administrator, or may take necessary measures (for example, transmit a stop instruction to an apparatus in a failure state) according to a predetermined processing procedure.
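The collect, examine, and act flow described above may be sketched as follows. This is a minimal illustration only, not a procedure taken from any particular system. The names `Report`, `examine`, `decide_action`, and `monitor`, the threshold `MAX_LOAD`, and the rule that a failure leads to a stop instruction while an overload leads to a warning are all assumptions introduced for the example.

```python
from dataclasses import dataclass

# Hypothetical overload threshold; a real system would configure this.
MAX_LOAD = 0.9


@dataclass
class Report:
    """Information collected from one monitored target apparatus (assumed shape)."""
    apparatus: str
    hardware_ok: bool
    software_ok: bool
    load: float  # fraction of capacity in use, 0.0 to 1.0


def examine(report: Report) -> list[str]:
    """Examine one collected report and return the problems found."""
    problems = []
    if not report.hardware_ok:
        problems.append("hardware failure")
    if not report.software_ok:
        problems.append("software failure")
    if report.load > MAX_LOAD:
        problems.append("overload")
    return problems


def decide_action(problems: list[str]) -> str:
    """Map detected problems to a measure under an assumed processing procedure."""
    if any(p.endswith("failure") for p in problems):
        return "stop"  # transmit a stop instruction to the apparatus
    if problems:
        return "warn"  # issue a warning to the system administrator
    return "none"


def monitor(reports: list[Report]) -> dict[str, str]:
    """Examine every collected report and return the measure per apparatus."""
    return {r.apparatus: decide_action(examine(r)) for r in reports}
```

For example, calling `monitor` with one healthy report, one report with `hardware_ok=False`, and one report with a load above `MAX_LOAD` yields the measures "none", "stop", and "warn" for the respective apparatuses.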
Note that a method has been proposed for an operations management system which carries out autonomous operation and management of computers according to a predefined workflow. In the proposed method, information is collected from the management target computers and cross-checked with stop determination rules to determine whether to continue or stop the autonomous control (see Japanese Laid-open Patent Publication No. 2007-4337, paragraphs [0028] and [0030]).
However, an increase in the number of items of information to be collected and examined leads to an increase in the monitoring load. Assume, for example, that information on specific items, such as the status of a hard disk drive (HDD), the status of a memory, and the number of transactions being executed, is examined continuously for each server. In this case, the workload of the information processing apparatus carrying out the examination increases.
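The growth in workload may be illustrated with a simple count. The sketch below assumes, purely for illustration, that every item is examined once per server in each monitoring cycle. The function name `examinations_per_cycle` and the item identifiers are hypothetical.

```python
# Items examined for each server, following the example above (identifiers assumed).
ITEMS = ("hdd_status", "memory_status", "transaction_count")


def examinations_per_cycle(num_servers: int, items: tuple[str, ...] = ITEMS) -> int:
    """Number of examinations in one cycle if every item is checked on every server."""
    return num_servers * len(items)
```

Under this assumption, the number of examinations per cycle is the product of the number of servers and the number of items, so adding either servers or items increases the workload proportionally.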