This invention relates in general to classification and analysis of events within a computer or computer network, and in particular to methods and systems for performing preliminary cause-based classification of events to facilitate cause-based analysis.
As computer networks become increasingly important and complex, so do the tasks of monitoring, evaluating, maintaining, managing, and troubleshooting in connection with network-associated functioning and communications.
Computer networks can generally be described as including nodes, with various interconnections allowing messaging, communications, or interaction between nodes. Nodes can be or include a wide variety of physical or conceptual components, such as computers, servers, clients, routers, hubs, switches, bridges, software, applications, programming modules, and other components. The route of a communication from a source node to a destination node may depend on a variety of factors, such as network configuration, efficiency, traffic, hardware or software availability, etc.
Certain events or conditions occurring or existing within a network (which together may be broadly referred to as events) may trigger or cause event messages relating to the events or conditions to be generated and stored. The event messages may be of any of a variety of types, including, for example, status messages, error messages, alert messages, alarm messages, informational messages, and others. The messages may contain various pieces, sets, or fields of information relating to the event, such as, for example, where (physically or conceptually), or in connection with what communication, the event occurred, when the event occurred, information concerning the nature of the event, circumstances associated with the event, parameters associated with the event, and other information.
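By way of illustration only, the kinds of fields an event message may carry can be sketched as a simple record. The class name, field names, and sample values below are hypothetical and are not taken from any particular system; the sketch merely assumes one field for each kind of information listed above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class EventMessage:
    """Hypothetical record for one stored event message."""
    kind: str                            # e.g. "status", "error", "alert", "alarm", "informational"
    node: str                            # where the event occurred (physical or conceptual)
    occurred_at: datetime                # when the event occurred
    nature: str                          # information concerning the nature of the event
    communication: Optional[str] = None  # communication in connection with which it occurred
    parameters: Dict[str, Any] = field(default_factory=dict)  # associated parameters


# Illustrative values only; none of these identifiers come from a real network.
example = EventMessage(
    kind="error",
    node="router-7",
    occurred_at=datetime(2002, 1, 15, 9, 30, tzinfo=timezone.utc),
    nature="link down",
    communication="packet-106",
    parameters={"interface": "eth0"},
)
```

Optional fields are given defaults because, as noted above, a given message may carry only some of these pieces of information.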
Often, it is desired or required to perform cause-based analysis with regard to an event. Cause-based analysis can include root cause analysis and determining cause-based relevancy relating to the event or related or associated events. For example, cause-based analysis may be used to determine or attempt to determine whether the event was caused by another event or condition, whether the event caused another event or condition, whether the event is a root cause or cause of a series or chain of events or conditions, or other cause-related information. Such cause-based analysis may be useful or necessary to determine the appropriate action or conduct to take with respect to the computer network in light of the event or events. For instance, if an event message indicates a problem in the network, cause-based analysis may be needed to determine or facilitate determination of what corrective, remedial, or proactive measures may need to be taken to correct the problem, cause or causes of the problem, or symptom or symptoms of the problem, or to prevent further related problems.
Computer network management methods and systems are known in the art. “Integrated Network Management VII: Integrated Management Strategies for the New Millennium”, 2001 IEEE/IFIP International Symposium on Integrated Network Management Proceedings, edited by George Pavlou, Nikos Anerousis, and Antonio Liotta, discusses network management strategies including fault analysis and management.
“Expert Systems Applications in Integrated Network Management”, edited by Eric C. Ericson, Lisa Traeger Ericson and Daniel Minoli, discusses network management including alarm monitoring and analysis.
U.S. Pat. No. 5,392,328, issued on Feb. 21, 1995, and entitled “System and Method for Automatically Detecting Root Causes of Switching Connection Failures in a Telephone Network”, discusses root cause detection and error message processing.
U.S. Pat. No. 6,012,152, issued on Jan. 4, 2000, and entitled “Software Fault Management System”, discusses fault management in a mobile telecommunications network.
FIG. 1 depicts a prior art computer network 100. The network 100 includes a number of nodes, including a source node 102 from which a data packet 106 is sent and a destination node 104 which is the ultimate destination of the data packet 106. The data packet 106 traverses a route through nodes of the network 100 as depicted by interconnections 108, 110, 112, 114, and 116. As a consequence of some event (which can be an event or condition) occurring or existing in the network 100, event messages E1, E2, and E3 are stored at nodes 118, 120, and 122. Event information 124, which may include event messages E1, E2 and E3, is collected and communicated to a database for later use and analysis.
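The collection of event information 124 described above can be sketched, purely for illustration, as gathering the messages stored at each node into a single list. The pairing of E1, E2, and E3 with nodes 118, 120, and 122 in that order is an assumption, as are the function name and the use of an in-memory list in place of the database.

```python
from typing import Dict, List, Tuple

# Messages stored at each node, keyed by the reference numerals of FIG. 1.
# Assumption: E1, E2, and E3 are stored at nodes 118, 120, and 122 in that order.
stored_at_node: Dict[int, List[str]] = {
    102: [],       # source node
    118: ["E1"],
    120: ["E2"],
    122: ["E3"],
    104: [],       # destination node
}


def collect_event_information(stored: Dict[int, List[str]]) -> List[Tuple[int, str]]:
    """Gather (node, message) pairs from every node into one list.

    The returned list stands in for event information 124; a real system
    would communicate it to a database rather than keep it in memory.
    """
    collected: List[Tuple[int, str]] = []
    for node in sorted(stored):          # visit nodes in numeric order
        for message in stored[node]:
            collected.append((node, message))
    return collected


event_information = collect_event_information(stored_at_node)
```

Keeping the originating node alongside each message preserves the "where" information that later cause-based analysis may rely on.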
Cause-based analysis, such as root cause analysis or cause relevancy determination, may be performed with respect to each of the events indicated by the event messages E1, E2, and E3. In many instances, the faster, more efficient, or more accurate the cause-based analysis or determination, the faster and more effectively appropriate responsive or corrective action or conduct may be taken. For example, if the event messages are alert or error messages, then the faster, more efficiently, or more accurately cause-based analysis is performed relating to the event messages, the faster and more effectively any remedial, corrective, or proactive action may be identified and taken. Furthermore, timely and appropriate action may greatly limit negative or costly consequences of problems, such as by allowing timely re-establishment of effective communications or interaction between network nodes. Additionally, more efficient cause-based analysis means that less time, computing power and resources need to be expended to perform the cause-based analysis.
There is a need for methods and systems to facilitate cause-based analysis in a networked computer system.