The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Some network management systems implement a data model of the managed network, in which programmatic objects represent network elements such as routers and switches, as well as links between the network elements. Other network management systems implement a network management function known as root cause analysis. Typically, a network problem causes observable changes in the attributes and states of entities in the network. As a result, a plurality of events may be emitted by one or more source entities in the network that observe the attribute changes and the state changes caused by the problem.
Under some approaches, root cause analysis may be performed using causality graphs constructed from the collected events. If such approaches converge to a solution within a finite amount of time, the constructed graphs may indicate root causes for problems in the physical network. However, existing techniques for root cause analysis, such as those that construct causality graphs using events as input, may take an inordinately long time to converge or may fail to converge at all, especially when the number of events is large. In addition, such techniques may not robustly handle situations in which key events are missing. Since events are typically collected using unreliable transport mechanisms such as syslog or the trap mechanism of the Simple Network Management Protocol (SNMP), some key events may never reach the network management system.
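By way of illustration only, the following minimal Python sketch shows one way an event-driven causality graph might be built and why a missing key event can mislead it. The rule table `RULES`, the event kinds, the entity names, and the functions `build_causality_graph` and `root_candidates` are all hypothetical and are not taken from any particular system; the sketch is a toy that always terminates and does not reproduce the convergence behavior of any existing technique.

```python
# Hypothetical causal rules (an assumption for illustration):
# each cause kind maps to the set of effect kinds it may produce.
RULES = {
    "link-down": {"neighbor-lost"},
    "neighbor-lost": {"route-withdrawn"},
}


def build_causality_graph(events):
    """Return edges (cause_index, effect_index) between collected events.

    Every event is compared with every other event, so the work grows
    quadratically with the number of events.
    """
    edges = []
    for i, (_, cause_kind) in enumerate(events):
        for j, (_, effect_kind) in enumerate(events):
            if i != j and effect_kind in RULES.get(cause_kind, set()):
                edges.append((i, j))
    return edges


def root_candidates(events):
    """Return the events that no other collected event explains."""
    explained = {effect for _, effect in build_causality_graph(events)}
    return [event for i, event in enumerate(events) if i not in explained]


# Each event is a (source entity, event kind) pair; names are made up.
all_events = [
    ("router-A", "link-down"),
    ("router-B", "neighbor-lost"),
    ("switch-C", "route-withdrawn"),
]

# With every event delivered, the link failure is the only candidate.
complete = root_candidates(all_events)

# If the key "link-down" event is lost in transit, a mere symptom
# ("neighbor-lost") is reported as the root cause instead.
incomplete = root_candidates(all_events[1:])
```

In this sketch, `complete` contains only the `link-down` event from `router-A`, whereas `incomplete` names the `neighbor-lost` event from `router-B`, which is only a symptom of the underlying problem.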
Some existing techniques configure a time window and, for efficiency purposes, disqualify (or remove) all events outside the window from the root cause analysis. However, because network problems and their symptoms propagate at different rates and appear at different times in different locations of the physical network, it is often difficult to configure such a time window so that it excludes irrelevant events while still including relevant events.
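The difficulty of choosing such a window may be sketched as follows. This is a minimal Python illustration under stated assumptions: the `Event` type, the function `filter_by_window`, the entity names, the event kinds, and all timestamps are hypothetical and chosen only to show the trade-off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str       # entity that emitted the event (hypothetical name)
    kind: str         # event kind (hypothetical)
    timestamp: float  # seconds relative to an arbitrary reference


def filter_by_window(events, start, end):
    """Keep only the events whose timestamp lies within [start, end]."""
    return [e for e in events if start <= e.timestamp <= end]


events = [
    Event("router-A", "link-down", 100.0),        # assumed root cause
    Event("router-B", "neighbor-lost", 101.5),    # symptom that appears quickly
    Event("switch-D", "config-saved", 102.0),     # unrelated to the problem
    Event("switch-C", "route-withdrawn", 190.0),  # symptom that appears late
]

# A narrow window drops the late but relevant symptom from switch-C,
# yet still admits the unrelated event from switch-D.
narrow = filter_by_window(events, 95.0, 105.0)

# A wide window recovers the late symptom, but filters out nothing.
wide = filter_by_window(events, 95.0, 200.0)
```

Neither window in this sketch separates the relevant events from the irrelevant one, because the two differ in relationship to the problem rather than in arrival time.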