Systems have been developed that automatically manage a target system by using policies, which describe rules for system management, so as to reduce management costs and respond to abnormalities. Typically, a policy includes a triggering condition and a triggered operation. When an event occurs inside or outside a management-targeted system, a policy handling system that handles policies retrieves a policy whose triggering condition is satisfied by the event and executes the triggered operation of the retrieved policy.
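The relationship between an event, a triggering condition, and a triggered operation can be illustrated with a minimal sketch. The names `Policy`, `handle_event`, and the dictionary-based event format are hypothetical and chosen only for illustration; they are not taken from any particular policy handling system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical event representation: a flat dictionary of string attributes.
Event = Dict[str, str]


@dataclass
class Policy:
    name: str
    condition: Callable[[Event], bool]  # triggering condition
    operation: Callable[[Event], str]   # triggered operation (returns a description)


def handle_event(policies: List[Policy], event: Event) -> List[str]:
    """Run the operation of every policy whose condition the event satisfies."""
    results = []
    for policy in policies:
        if policy.condition(event):
            results.append(policy.operation(event))
    return results


policies = [
    Policy(
        name="backup-server-a",
        condition=lambda e: e.get("type") == "clock" and e.get("time") == "03:00",
        operation=lambda e: "back up files of server A",
    ),
    Policy(
        name="restart-on-crash",
        condition=lambda e: e.get("type") == "crash",
        operation=lambda e: f"restart {e['host']}",
    ),
]

# A clock event at 3:00 AM satisfies only the first policy's condition.
executed = handle_event(policies, {"type": "clock", "time": "03:00"})
```

In this sketch an event that satisfies no triggering condition simply yields an empty list, so no operation is executed for it.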
Provided that the policies are appropriately set, the policy handling system can achieve autonomous and automatic management of the management-targeted system by executing the operations of the policies. However, an inappropriately set policy may not only prevent autonomous processing from being properly executed but also cause an abnormality in the management-targeted system. Further, even if each individual policy is appropriately set, a lack of consistency between policies may cause an abnormality in the management-targeted system through interaction between the policies. Therefore, when system management is performed using policies, it is necessary to monitor whether a failure is caused by any single policy or by interaction between the policies.
Known techniques for detecting an error in policy setting, or an inconsistency or contradiction between policies, include the one described in JP2004-303190A (Patent Document 1). FIG. 39 shows the configuration of an information processing apparatus (policy handling system) 500 described in Patent Document 1. The technique described in this publication uses a policy database 504 for storing normal policies and a global-policy database 507 for storing upper-level policies (global policies) that specify events or the like that are not allowed to occur.
The policy database 504 stores therein normal policies, each including a condition and an operation, such as “to back up files of server A at 3:00 AM”. A policy retrieving section 502 retrieves a policy having a triggering condition that is satisfied by an event received by an event receiving section 501 and passes the retrieved policy to an operation executing section 503. The operation executing section 503 executes the operation specified in the policy on the management-targeted system.
The global-policy database 507 stores therein a global policy imposing restrictions on the operation of the policy, such as “plurality of backups are not allowed to be created simultaneously on one tape device”. A global-policy comparison section 506 checks whether a policy stored in the policy database 504 conflicts with a global policy stored in the global-policy database 507. A policy correcting section 505 corrects any policy that conflicts with the global policy so that the conflict is eliminated.
Assume, for example, that the policy database 504 stores therein policies specifying “to back up files of server A at 3:00 AM” and “to back up files of server B at 3:00 AM”. In this case, neither policy is problematic by itself. However, if the operation executing section 503 executes both policies, it backs up the files of servers A and B simultaneously at 3:00 AM, so that the policies are found to conflict with the global policy specifying “plurality of backups are not allowed to be created simultaneously on one tape device”. The policy correcting section 505 therefore modifies, e.g., the content of the policy “to back up files of server B at 3:00 AM” into “to back up files of server B at 2:00 AM”.
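The check and correction in this example can be sketched as follows. This is only an illustration under stated assumptions: the `tape_device` field, the names `BackupPolicy` and `correct_conflicts`, and the rule of moving a conflicting policy one hour earlier are hypothetical; the publication is described here only as changing 3:00 AM to 2:00 AM in this one case.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class BackupPolicy:
    server: str
    hour: int          # start hour, 0-23
    tape_device: str   # hypothetical field; the text only says "one tape device"


def correct_conflicts(policies: List[BackupPolicy]) -> List[BackupPolicy]:
    """Enforce the global policy "no two backups at once on one tape device".

    Assumed correction rule: a policy that collides with an earlier-registered
    one is moved one hour earlier until a free slot is found.
    """
    occupied = set()
    corrected = []
    for policy in policies:
        hour = policy.hour
        for _ in range(24):
            if (policy.tape_device, hour) not in occupied:
                break
            hour = (hour - 1) % 24
        else:
            # All 24 hourly slots on this device are already taken.
            raise ValueError(f"no free slot on {policy.tape_device}")
        occupied.add((policy.tape_device, hour))
        corrected.append(replace(policy, hour=hour))
    return corrected


registered = [
    BackupPolicy(server="A", hour=3, tape_device="tape0"),
    BackupPolicy(server="B", hour=3, tape_device="tape0"),
]
corrected = correct_conflicts(registered)
# Server A keeps 3:00 AM; server B is moved to 2:00 AM.
```

Note that this kind of check works purely on the registered policy descriptions; it does not observe how the policies behave once they are running.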
In the technique described in the above publication, the policy correcting section 505 automatically corrects a parameter of a policy that conflicts with the global policy, so that the operation described in the policy allows autonomous processing to run normally. The technique can thus prevent execution of a policy conflicting with the global policy, thereby achieving autonomous and automatic management of the management-targeted system.
However, in the technique described in the above publication, whether a policy conflicts with the global policy is analyzed at the time of registration of the policy, based on the global policy stored in the global-policy database 507. Therefore, it is impossible to cope with a situation in which a policy causes a chain of troubles in association with the management-targeted system, such as troubles arising from the system operation or from an unexpected change in the state of policies, because such troubles cannot be grasped from the policy descriptions alone. For example, it is impossible to cope with a situation in which, when one resource is dynamically assigned to whichever service has the heavier load, two services contend for the resource and the resource is reassigned back and forth between them.
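The resource contention described above can be reproduced with a small simulation. The load model (holding the resource halves a service's load), the demand figures, and the function names are assumptions made purely for illustration.

```python
from typing import Dict, List


def load(demand: float, has_resource: bool) -> float:
    # Assumed load model: holding the extra resource halves the service's load.
    return demand / 2 if has_resource else demand


def simulate(demands: Dict[str, float], owner: str, steps: int) -> List[str]:
    """Apply the policy "assign the resource to the heavier-loaded service"
    once per monitoring step and record which service owns the resource."""
    history = [owner]
    for _ in range(steps):
        loads = {name: load(d, name == owner) for name, d in demands.items()}
        owner = max(loads, key=loads.get)  # policy operation: reassign
        history.append(owner)
    return history


history = simulate({"A": 10.0, "B": 9.0}, owner="A", steps=6)
# The resource alternates between the services at every step.
reassignments = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
```

The policy itself looks reasonable when read in isolation; the oscillation appears only when its effect on the load is taken into account, which is why an analysis at registration time does not reveal it.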
To detect and solve problems involved in policy operation, it is necessary to monitor the actual operating state of each policy. However, in the technique of Patent Document 1, the operating state of a policy is not monitored after the policy is triggered. It is therefore impossible to detect a negative spiral of the following kind: because of an inconsistency between the system operation and the policy description, or between the descriptions of a plurality of policies, the policy operation causes an unintended change in the system state; the unintended change triggers the same policy once again; and the policy thus triggered worsens the system state.
Further, the technique described in the above publication provides no mechanism for responding in real time to a failure caused by the triggering of a policy so as to minimize the spread of the failure. If human judgment is required to determine whether the management-targeted system should be stopped, the failure may spread while the administrator is considering a countermeasure. Thus, even if an abnormal policy that appears to cause the negative spiral can be detected, the administrator cannot stop the system in time because the policy operation is executed automatically and at high speed, with the result that the spread of the failure cannot be prevented.