The present invention relates to a technique for supporting the work of monitoring events that occur in an information system.
One of the operational tasks for an information system (IT (Information Technology) system) is event monitoring. In critical IT systems used for core business operations in companies, an event is issued when a phenomenon such as a malfunction or an abnormality occurs.
An event is data issued by a program called an agent, which is installed in the IT system. The agent observes the elements that make up the IT system, for example, hardware or software such as an OS (operating system) and middleware. It observes the performance of an object to be monitored and its state, such as whether the object is alive or dead, and acquires log data output from the object to be monitored. When the observed state or the acquired log data satisfies a specific condition, the agent issues an event indicating that a phenomenon corresponding to the specific condition has occurred. Since the event is data for conveying the phenomenon that has occurred to a person, the event usually includes character string data that represents, in a form a person can read and understand, the place where the phenomenon occurred, the object to be monitored in which it occurred, and the phenomenon itself. This character string data is called an event message. The event issued by the agent is sent to a management computer.
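The behavior of an agent described above can be illustrated with a minimal sketch. It is an illustration only, not the implementation of any particular agent: the names `Event`, `Rule`, and `check_log_line`, the field names, and the event message format are all hypothetical, and the specific condition is assumed to be a simple test applied to one line of log data.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Event:
    place: str    # where the phenomenon occurred (hypothetical field)
    target: str   # the object to be monitored (hypothetical field)
    message: str  # human-readable event message


@dataclass
class Rule:
    condition: Callable[[str], bool]  # specific condition tested against a log line
    phenomenon: str                   # text describing the corresponding phenomenon


def check_log_line(place: str, target: str, line: str,
                   rules: List[Rule]) -> Optional[Event]:
    """Return an Event if the log line satisfies any rule, otherwise None."""
    for rule in rules:
        if rule.condition(line):
            # The message format is an assumption made for this sketch.
            message = f"{place} {target}: {rule.phenomenon}"
            return Event(place, target, message)
    return None


# Hypothetical rule: any log line containing "ERROR" triggers an event.
rules = [Rule(lambda line: "ERROR" in line, "error reported in log")]
event = check_log_line("host01", "middleware",
                       "2024-01-01 ERROR disk full", rules)
```

In an actual system the issued event would then be sent to the management computer; the transport is omitted here.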
The management computer stores the events received from each agent and centrally manages the stored events. At the management computer, a monitoring operator (a human) monitors the events received from each agent. The monitoring operator checks the received events one by one and, on discovering an event that may lead to a serious failure, reports the event to a host manager. This reporting of an event by the monitoring operator to the host manager is referred to as escalation. The monitoring operator determines whether escalation is necessary according to an event handling guide.
The event handling guide describes, for each event, a guide indicating how to handle that event. Each guide includes a guide message, which is a sample of the event message of the event to be handled, and a criterion for determining whether the event is to be escalated.
When a new event arrives at the management computer, the monitoring operator first searches the event handling guide for a guide that matches the event. Specifically, the monitoring operator finds a guide whose guide message is close to the content of the event message included in the event, by visually inspecting the document, searching the document, or the like. The monitoring operator then determines whether escalation is necessary according to the determination criterion included in the found guide and performs the escalation as needed.
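The manual procedure above, finding the guide whose guide message is closest to the event message and then applying its criterion, can be sketched as follows. This is a simplified illustration under stated assumptions: closeness is measured by word overlap (Jaccard similarity), which is one possible choice and not a measure taken from the text, and the determination criterion is reduced to a single hypothetical boolean field `escalate`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Guide:
    guide_message: str  # sample event message for the event to be handled
    escalate: bool      # simplified determination criterion (assumption)


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity between the whitespace-separated word sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def find_guide(event_message: str, guides: List[Guide]) -> Optional[Guide]:
    """Return the guide whose guide message is closest to the event message."""
    if not guides:
        return None
    return max(guides,
               key=lambda g: token_overlap(event_message, g.guide_message))


# Hypothetical event handling guide with two entries.
guides = [
    Guide("disk usage exceeded threshold on host", escalate=False),
    Guide("database process stopped on host", escalate=True),
]
best = find_guide("database process stopped on host db01", guides)
needs_escalation = best is not None and best.escalate
```

A linear scan like this mirrors what the operator does by hand, and its cost grows with the number of guides in the event handling guide.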
The event monitoring work includes the series of operations related to event monitoring described above. Continuing the event monitoring work appropriately makes it possible to discover a failure of the IT system at an early stage and to handle the failure appropriately. In other words, the monitoring operator must perform the event monitoring work continuously for as long as the IT system is in operation. For that reason, the labor cost of the event monitoring work becomes very large.
In particular, finding the guide that handles a given event is significantly costly for the monitoring operator. If the IT system to be monitored is large in scale, or the number of IT systems to be monitored is large, the number of guides included in the event handling guide may range from thousands to several tens of thousands, and searching for the guide that handles the event may take a long time. An increase in working time not only increases the cost but also delays the response to the failure.
For that reason, technology for supporting the event monitoring work by computer has been proposed. US 2014/0324865A1 discloses a technique of comparing an event message with a guide message as character strings and automatically identifying a guide message similar to the event message. The technique of US 2014/0324865A1 compares the character strings of the rows output to a log with each other and calculates the proximity of the rows.
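As a rough illustration of comparing two messages as character strings, the following sketch scores proximity with `difflib.SequenceMatcher` from the Python standard library. This is a stand-in chosen for illustration; it is not the calculation disclosed in US 2014/0324865A1, and the function names `proximity` and `rank_guide_messages` and the sample messages are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def proximity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, a, b).ratio()


def rank_guide_messages(event_message: str,
                        guide_messages: List[str]) -> List[Tuple[float, str]]:
    """Return (score, guide message) pairs, most similar first."""
    scored = [(proximity(event_message, g), g) for g in guide_messages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical guide messages containing placeholders.
guide_messages = [
    "Disk usage on <host> exceeded <n> percent",
    "Connection to <host> timed out after <n> s",
]
ranked = rank_guide_messages("Connection to db01 timed out after 30 s",
                             guide_messages)
most_similar = ranked[0][1]
```

Because the comparison is purely character-based, variable parts of a message such as host names and numeric values lower the score even when the two messages describe the same phenomenon.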