This invention relates to a management server, and more particularly, to a management server for a service management business.
An operations manager of a computer system performs a service management business (or monitoring business) for monitoring a failure that affects a service operational on the computer system, and an anomaly predictive of the failure. When a failure or anomaly is detected in a monitoring business, the operations manager analyzes the cause of the detected failure or anomaly and, as necessary, takes measures against it in accordance with the analysis result.
The above-mentioned failure and anomaly are hereinafter referred to as “incident” in accordance with the terms of Information Technology Infrastructure Library (ITIL).
The software that supports the above-mentioned monitoring business includes a monitoring tool and an incident management tool.
The monitoring tool is software that supports detection of incidents and analysis of the causes of the incidents. The monitoring tool has a first function of communicating with the hardware and software of a monitoring subject to collect data indicating the operational statuses of the system.
The data indicating the operational statuses includes data (values) indicative of the performance of the computer system, such as a CPU usage rate, and a log (string of characters) of an application or the like. In recent years, there has been proposed a monitoring tool that collects a wide variety of logs, and permits an operations manager to search the collected logs. Those values and strings of characters are hereinafter generally referred to as “historical data”.
Further, the monitoring tool has a second function of transmitting an alert to the operations manager when historical data satisfies conditions specified in advance. Further, the monitoring tool has a third function of processing historical data into display data, such as a line graph and bar graph, that permits the operations manager to recognize the content of the historical data, and displaying the processed data on the screen of a manager terminal.
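As a non-limiting illustration, the second function described above (issuing an alert when historical data satisfies conditions specified in advance) can be sketched as follows. The names Sample, AlertRule, and evaluate, as well as the 90% CPU usage threshold, are hypothetical and are introduced only for this sketch; they are not taken from any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    """One piece of historical data (hypothetical structure)."""
    subject: str    # monitoring subject, e.g. a host name
    item: str       # monitoring item, e.g. "cpu_usage"
    timestamp: int  # date and time of collection (epoch seconds)
    value: float    # collected value


@dataclass
class AlertRule:
    """A condition specified in advance for one monitoring item."""
    item: str
    condition: Callable[[float], bool]
    message: str


def evaluate(samples: List[Sample], rules: List[AlertRule]) -> List[str]:
    """Return one alert text for every sample that satisfies a rule."""
    alerts = []
    for sample in samples:
        for rule in rules:
            if rule.item == sample.item and rule.condition(sample.value):
                alerts.append(
                    f"{sample.subject}: {rule.message} (value={sample.value})"
                )
    return alerts


# Assumed example data: the threshold of 90.0 is illustrative only.
samples = [
    Sample("web01", "cpu_usage", 0, 95.0),
    Sample("web01", "cpu_usage", 60, 40.0),
]
rules = [AlertRule("cpu_usage", lambda v: v > 90.0, "CPU usage above 90%")]
alerts = evaluate(samples, rules)
```

In this sketch only the first sample satisfies the condition, and hence only one alert is generated.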
The incident management tool is software for managing the contents of past incidents, and measures that have been taken against the past incidents. When a new incident occurs, the operations manager registers the content of the incident into the incident management tool. The operations manager also registers the cause of the incident that has been found in the course of processing the incident, and the measure taken against the incident in the incident management tool. This registration is made in order to permit the operations manager to use the know-how of the past when an incident similar to the past incident occurs in the future.
Because the monitoring tool and the incident management tool are used in combination in many cases, a product that integrates the monitoring tool and the incident management tool has also been proposed. Such a product is hereinafter referred to as “service management server”. The service management server is effective in shortening the working time of the operations manager.
For example, the service management server can automatically register an incident in a storage area connected to the service management server based on an alert transmitted by the monitoring tool. Accordingly, the service management server can eliminate the need for the operations manager to register the incident manually.
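As a non-limiting illustration, the automatic registration of an incident based on an alert can be sketched as follows. The classes Incident and IncidentStore and their methods are hypothetical, and an in-memory dictionary is assumed in place of the storage area connected to the service management server.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Incident:
    """Record of one incident (hypothetical structure)."""
    incident_id: int
    subject: str                   # monitoring subject where it was detected
    content: str                   # content of the incident
    cause: Optional[str] = None    # filled in once the cause is found
    measure: Optional[str] = None  # filled in once a measure is taken


class IncidentStore:
    """In-memory stand-in for the storage area (an assumption)."""

    def __init__(self) -> None:
        self._incidents: Dict[int, Incident] = {}
        self._next_id = 1

    def register_from_alert(self, subject: str, alert_message: str) -> Incident:
        """Register an incident automatically when an alert is received."""
        incident = Incident(self._next_id, subject, alert_message)
        self._incidents[incident.incident_id] = incident
        self._next_id += 1
        return incident

    def record_resolution(self, incident_id: int, cause: str, measure: str) -> Incident:
        """Record the cause and the measure taken, for future reference."""
        incident = self._incidents[incident_id]
        incident.cause = cause
        incident.measure = measure
        return incident


store = IncidentStore()
incident = store.register_from_alert("web01", "CPU usage above 90%")
store.record_resolution(incident.incident_id, "runaway batch job", "restarted the job")
```

In this sketch the operations manager supplies only the cause and the measure; the content of the incident is taken from the alert.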
Further, for example, the service management server can display data about an incident on the screen, and lead the operations manager from there to the screen showing historical data of the monitoring subject where the displayed incident has been detected. Accordingly, the service management server can eliminate the need for the operations manager to retrieve the historical data of the monitoring subject where the incident has been detected.
The operations manager needs to view many kinds of historical data to analyze the cause of an incident. Therefore, in the monitoring operation using the service management server, the time taken to analyze the cause of an incident undesirably becomes longer as the types or number of pieces of software and hardware monitored by one operations manager increase.
In addition, a service management server employing a rule-based technology to analyze the causes of incidents has appeared in recent years. Even when such a rule-based technology is used, however, the operations manager needs to view historical data to verify the correctness of the root cause detected automatically. Therefore, the monitoring work with the service management server still has the problem that the view time of the operations manager becomes long.
One way of shortening the view time is to let a service management server hold, in advance, a procedure manual describing the procedures of the work of an operations manager (including viewing of historical data), and to permit the operations manager to refer to the procedure manual based on the contents of an incident. The operations manager can thereby grasp the historical data to be checked and the measures to be taken against the incident. This approach, however, imposes on the operations manager the cost of creating the procedure manual in advance.
Further, another technology for reducing the view time has been proposed that automatically generates procedures for a remote maintenance operation based on the status of an incident and a knowledge DB (for example, see Japanese Patent Application Laid-open No. 2010-224829). The technology disclosed in Japanese Patent Application Laid-open No. 2010-224829 automatically generates some of the procedures based on the status of an incident, thus reducing the cost of generating a procedure manual. However, the technology disclosed in Japanese Patent Application Laid-open No. 2010-224829 cannot generate procedures for an incident for which knowledge has not been stored in the DB in advance.
A further technology for reducing the view time has been proposed that identifies a past incident similar to a newly occurring incident, and provides an operations manager with the measure taken against the identified past incident (for example, see Japanese Patent Application Laid-open No. 2009-110293). However, with the technology disclosed in Japanese Patent Application Laid-open No. 2009-110293, even when the measure against the past incident is identified, it takes time for the operations manager to interpret what the identified measure means for the new incident. Further, because the contents of measures against past incidents may contain company secrets, measures against past incidents may not be directly shared in monitoring operations among different companies.
By contrast, the related-art recommendation technology applied to a Web site or the like can calculate the deviation of the number of accesses to each piece of data and recommend to the user an access that is carried out frequently. Accordingly, a technology for reducing the view time has further been proposed that applies such a recommendation technology to a service management server to shorten the time for viewing historical data without the need for an operations manager to create a procedure manual in advance (for example, see Japanese Patent Application Laid-open No. 2011-108034). The technology disclosed in Japanese Patent Application Laid-open No. 2011-108034 recommends a Web page based on an access log for Web pages having a plurality of attributes.
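As a non-limiting illustration, a recommendation based on the deviation of the number of accesses can be sketched as follows. The cited publication does not specify the calculation shown here; the use of a standard score (the difference from the mean divided by the standard deviation) and the name recommend are assumptions made only for this sketch.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import List


def recommend(access_log: List[str], top_n: int = 1) -> List[str]:
    """Return the items accessed markedly more often than average.

    access_log is a list of identifiers of accessed data, one entry per
    access. The standard score used here is an assumed stand-in for
    "deviation of the number of accesses".
    """
    counts = Counter(access_log)
    if not counts:
        return []
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Accesses are spread evenly, so no item stands out.
        return []
    scores = {key: (count - mu) / sigma for key, count in counts.items()}
    ranked = sorted(scores, key=lambda key: scores[key], reverse=True)
    # Keep only items accessed more often than the average.
    return [key for key in ranked if scores[key] > 0][:top_n]


# Assumed example logs; the identifiers are illustrative only.
skewed = recommend(["page_a", "page_a", "page_a", "page_b", "page_c"])
even = recommend(["page_a", "page_b", "page_c"])
```

In this sketch an item is recommended only when accesses concentrate on it. When accesses are spread evenly over the items, no item stands out and nothing is recommended.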
Historical data has attributes such as a monitoring subject, a monitoring item (the type of value included in the historical data), and a date and time, and hence the service management server can process historical data in the same way as a Web page is processed. For a service management server, however, the historical data to be accessed by an operations manager varies depending on the content of an incident, and hence intensive access to specific historical data is not likely to occur frequently. Therefore, the related-art recommendation technology, if applied to an access log for a service management server, may not produce adequate recommendations.