In a computer network, alarm root cause analysis is usually divided into device level alarm root cause analysis and network level alarm root cause analysis. Device level alarm root cause analysis concerns the relations among alarms generated by an individual device, while network level alarm root cause analysis concerns the relations among alarms generated by multiple devices. Currently, device level alarm root cause analysis has mature implementations, but network level alarm root cause analysis is limited because a network management system lacks end-to-end network path information.
An alarm root cause analysis solution in the prior art is based on a service object model; dependencies between objects may be obtained by analyzing the service object model. When one object fails, it generates an alarm, and the objects that depend on the failed object are affected and also generate alarms. The former is a root cause alarm and the latter are derivative alarms. Therefore, when multiple objects generate alarms, the relations between the alarms can be obtained according to the dependencies between the objects. Further, alarm correlation rules can be generalized from these relations. When alarms are generated, alarm root cause analysis may be performed according to the alarm correlation rules.
FIG. 1 is a schematic diagram of a network segment based on a service object model in the prior art. As shown in FIG. 1, the network segment is formed by three devices, namely device A, device B, and device C, each of which serves as a node (such as a router) in the network segment. A service object model, which includes objects such as a card (Card), a physical port (Physical Port), an interface (Interface), a tunnel (Tunnel), a virtual private network (Virtual Private Network, VPN), and a border gateway protocol peer (Border Gateway Protocol Peer, BGP Peer), is built on each of device A and device C. In the service object model, an upper layer object is dependent on its lower layer objects. In this way, when an object borne on device A, such as a physical port (Physical Port), fails, device A raises an alarm, and the affected Interface, Tunnel, VPN and BGP Peer borne on device A also generate alarms. Meanwhile, the BGP Peer on device C also generates an alarm. Root cause analysis of these alarms may be performed according to the relations between the objects in the service object model.
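The dependency-based classification described above can be sketched as follows. This is a minimal illustration only: the strictly linear dependency chain in `DEPENDS_ON` and the function name `classify_alarms` are assumptions made for the example and are not taken from the prior-art solution, which states only that an upper layer object depends on its lower layer objects.

```python
# Hypothetical dependency chain for the objects borne on one device.
# Assumption: each object depends on exactly one lower layer object.
DEPENDS_ON = {
    "Physical Port": "Card",
    "Interface": "Physical Port",
    "Tunnel": "Interface",
    "VPN": "Tunnel",
    "BGP Peer": "VPN",
}


def classify_alarms(alarmed_objects):
    """Split alarmed objects into root cause and derivative alarms.

    An alarm is treated as derivative if any object below it in the
    dependency chain has also raised an alarm; otherwise it is a root cause.
    """
    alarmed = set(alarmed_objects)
    root, derivative = [], []
    for obj in sorted(alarmed):
        lower = DEPENDS_ON.get(obj)
        is_derivative = False
        # Walk down the chain looking for an alarmed lower layer object.
        while lower is not None:
            if lower in alarmed:
                is_derivative = True
                break
            lower = DEPENDS_ON.get(lower)
        (derivative if is_derivative else root).append(obj)
    return root, derivative


root, derivative = classify_alarms(
    ["Physical Port", "Interface", "Tunnel", "VPN", "BGP Peer"]
)
print("root cause:", root)
print("derivative:", derivative)
```

With the physical port failure of FIG. 1 as input, the physical port alarm has no alarmed object beneath it and is therefore reported as the root cause, while the alarms of the objects above it are reported as derivative.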
On the other hand, if device B fails, device B generates an alarm; meanwhile, because device B has failed, the path between device A and device C also fails. In this case, the objects borne on device A and device C also fail accordingly and generate alarms. Intuitively, the alarm generated by device B should be the root cause alarm and the alarms generated by the objects borne on device A and device C should be derivative alarms. However, a prerequisite of such network level alarm root cause analysis is that the network management system knows the path information between device A and device C.
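The role of path information in this scenario can be sketched as follows. The function name `find_transit_root_cause` and the representation of a path as an ordered list of device names are hypothetical choices made for illustration; the sketch only shows that the correlation is possible when the path is known and impossible when it is not.

```python
def find_transit_root_cause(src, dst, failed_devices, paths):
    """Return the failed transit device on the path from src to dst.

    paths maps (src, dst) to an ordered list of devices, endpoints included.
    Returns None if the path is unknown or no transit device has failed,
    in which case the endpoint alarms cannot be correlated at network level.
    """
    path = paths.get((src, dst))
    if path is None:
        return None  # no path information: correlation is not possible
    for device in path[1:-1]:  # skip the two endpoints
        if device in failed_devices:
            return device
    return None


# Path information assumed to be available to the network management system.
known_paths = {("A", "C"): ["A", "B", "C"]}

print(find_transit_root_cause("A", "C", {"B"}, known_paths))
print(find_transit_root_cause("A", "C", {"B"}, {}))
```

In the first call the path through device B is known, so the alarm of device B can be identified as the root cause of the alarms on device A and device C. In the second call no path information is available and the same alarms remain uncorrelated.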
A feasible solution is that the network management system collects network routing information in real time and caches the routing information of the most recent period of time (usually a few minutes) for alarm root cause analysis when a device fails. However, this solution requires the network management system to collect routing information from all devices in the network, which consumes many resources of the network management system and can hardly guarantee real-time collection. When a large volume of routing information exists, considerable bandwidth resources and device resources are consumed, thereby affecting the network performance. As a result, a comprehensive implementation of network level alarm root cause analysis cannot be realized.
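The caching approach described above can be sketched as a time-windowed store of routing snapshots. The class `RouteCache`, its method names, the 300-second window, and the representation of routing information as a dictionary are all assumptions made for illustration. The sketch also assumes that snapshots are added in non-decreasing timestamp order.

```python
from collections import deque


class RouteCache:
    """Keep routing snapshots from the most recent window of time only."""

    def __init__(self, window_seconds=300.0):
        self.window = window_seconds
        self._snapshots = deque()  # entries are (timestamp, device, routes)

    def add(self, timestamp, device, routes):
        """Store a snapshot and evict entries older than the window."""
        self._snapshots.append((timestamp, device, routes))
        while self._snapshots and timestamp - self._snapshots[0][0] > self.window:
            self._snapshots.popleft()

    def latest_before(self, device, timestamp):
        """Return the newest cached routes of a device at or before timestamp."""
        for ts, dev, routes in reversed(self._snapshots):
            if dev == device and ts <= timestamp:
                return routes
        return None


cache = RouteCache(window_seconds=300.0)
cache.add(0.0, "A", {"C": "B"})    # device A reaches C via next hop B
cache.add(100.0, "A", {"C": "D"})  # a later snapshot of device A
print(cache.latest_before("A", 50.0))
```

Even in this simplified form, every device in the network must supply a snapshot in each collection cycle for the cache to be useful, which is the source of the resource consumption noted above.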