The present invention relates to an operation control system for monitoring the operational state of a system. More particularly, the present invention relates to a technique for obtaining operation performance data from a monitored object in order to monitor the operational state of the system.
According to a prior art technique, an operation control system uses a control computer to periodically obtain various types of operation performance data from monitored computers in order to monitor the operational state of a network system. The obtained operation performance data is displayed on the display of the control computer and is used by the manager to perform failure analysis and pattern analysis of the operational state of the network system.
To reduce the network load occurring when operation performance data is collected from a monitored object, Japanese Laid-Open Patent Publication No. 11-234274 discloses a technique for performing failure analysis by use of the monitored server.
However, the control system disclosed in the above Japanese Laid-Open Patent Publication does not change the number or the types of monitored items (e.g., CPU usage rate, memory usage rate, etc.) after it has been determined, based on the operation performance (metric) value of a specific monitored item, that the operational state of the system has become risky.
The manager, on the other hand, determines the degree of risk involved with the operational state of the system, and the risk factors, by checking two things: the operation performance value of a specific monitored item that falls within a risk range defined by a certain threshold value, and the operation performance values of the monitored items related to that item. Thus, the monitored items actually used to monitor the operational state of the system are limited to those whose operation performance values are within the risk range and their related monitored items.
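By way of illustration only, the manager's check described above may be sketched as follows. The sketch is not taken from the cited publication. The names MonitoredItem and items_to_check, the example values, and the convention that a value at or above the threshold lies within the risk range are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MonitoredItem:
    """Hypothetical record for one monitored item (illustrative only)."""
    name: str
    value: float      # latest operation performance value
    threshold: float  # assumed: value >= threshold means "within the risk range"
    related: List[str] = field(default_factory=list)  # names of related items


def items_to_check(items: Dict[str, MonitoredItem]) -> Set[str]:
    """Return the items a manager would actually examine: every item whose
    value is within its risk range, plus the items related to it."""
    selected: Set[str] = set()
    for item in items.values():
        if item.value >= item.threshold:
            selected.add(item.name)
            # Related items are examined regardless of their own values.
            selected.update(r for r in item.related if r in items)
    return selected


# Example values are invented for illustration.
items = {
    "cpu_usage": MonitoredItem("cpu_usage", 92.0, 80.0,
                               related=["memory_usage", "process_count"]),
    "memory_usage": MonitoredItem("memory_usage", 45.0, 85.0),
    "process_count": MonitoredItem("process_count", 120.0, 500.0),
    "disk_io": MonitoredItem("disk_io", 30.0, 70.0),
}

print(sorted(items_to_check(items)))
```

In this sketch only the CPU usage rate is within its risk range, so the CPU usage rate and its two related items are selected, while the disk I/O item, which is neither in its risk range nor related to an item that is, is not examined.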
The control system disclosed in the above Japanese Laid-Open Patent Publication obtains data for all of the predetermined, fixed monitored items. This increases both the memory capacity required for storing the operation performance data and the use of the network (communication line) for transmitting and receiving the operation performance data. It also unduly reduces the processing performance of the CPU of the monitored computer, causing the problem of reduced processing performance for ordinary services.