The present invention relates in general to event monitoring, and more particularly, to combining event information and symptom knowledge for root cause analysis and for problem prevention and correction. The present invention further relates to presenting event information and symptom knowledge to an operator.
An event may be used to represent a change in the state of a hardware or software component of a business system. An event may also represent a change in the status of information or knowledge of the business system that could impact the operation or processing of the system or a subset of the system. As a few simplified yet illustrative examples, an event could represent a storage device that has run out of available memory, or a computer or hardware device that has become disconnected from a network. An event could also report the performance of a web-based business process via a systems management monitor that is monitoring the amount of free CPU cycles available on a server. Further, an event could represent knowledge of a change in the status of information, such as information related to a branch or department within the business, or a change in information related to a customer, client, business partner, supplier or other source that interacts with, is relied upon by, or is otherwise considered by the business.
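By way of a non-limiting illustration, the kinds of events described above might be captured as simple records. The following Python sketch is offered as an example only; the names `Event`, `EventKind`, and `example_events`, as well as the source identifiers, are hypothetical and are not drawn from any particular event monitoring product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class EventKind(Enum):
    # Change in the state of a hardware or software component.
    COMPONENT_STATE = "component_state"
    # Change in the status of information or knowledge of the business system.
    INFORMATION_STATUS = "information_status"


@dataclass(frozen=True)
class Event:
    source: str          # hypothetical identifier of the reporting component
    kind: EventKind
    description: str
    # Time the event record was created, in UTC.
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def example_events() -> list[Event]:
    """Return records mirroring the illustrative examples in the text."""
    return [
        Event("storage-01", EventKind.COMPONENT_STATE,
              "storage device has run out of available memory"),
        Event("host-17", EventKind.COMPONENT_STATE,
              "device has become disconnected from the network"),
        Event("supplier-records", EventKind.INFORMATION_STATUS,
              "information related to a supplier has changed"),
    ]
```

In this sketch, both component state changes and changes in the status of information share one record type, so that a monitor could report either kind through the same channel.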
Events that affect the operation of the business system need to be managed to ensure that the system operates at a satisfactory level. Accordingly, event monitoring software is available, which typically provides filtering and reporting capabilities to depict activity within the enterprise system. However, a human operator remains responsible for analyzing and reacting to problems associated with the reported events. The task of monitoring events increases in complexity as the volume and number of sources of the events increase. Often, a more complex problem in the system is revealed only by a combination of multiple events, and human analysis becomes a difficult and cumbersome task. Unfortunately, many businesses do not possess the human expert knowledge required to associate reported events with root causes in an efficient manner. Rather, the response to events may be limited to problem management and to addressing incidents after the problem has occurred, leading to inefficient use of enterprise resources.