Known fault localization methods aim at finding the correlation between fault-carrying events and a network fault. This is usually a difficult task because a single network fault can cause a relatively large number of events.
An important limitation of such methods is that they restrict the scope of correlation to fault-carrying events. In a complex communication system, such as a mobile system, processing fault events alone may not be sufficient for successful fault localization, because the fault events may themselves be merely effects or symptoms of the real fault. These symptoms may appear far away from the actual location of the fault.
For example, subsystem I has a retransmit timer that can be set such that a service II experiences a large access delay, because the delay requirements of its state machine are often exceeded. The excess delay is due to the interaction between the retransmit timer of subsystem I and the state machine of service II. In this example, service II accordingly reports a fault, whereas subsystem I does not, although the problem is located in subsystem I.
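The interaction described above can be illustrated with a minimal sketch. All names and numeric values below (base delay, timer setting, retransmission limit, delay requirement) are hypothetical and chosen only for illustration. The sketch further assumes that subsystem I raises an alarm only when its own retransmission limit is exceeded, and that service II raises an alarm when the accumulated access delay exceeds the delay requirement of its state machine.

```python
def access_delay_ms(base_delay_ms, retransmit_timer_ms, retransmissions):
    """Access delay seen by service II: base delay plus one timer period
    of subsystem I for every retransmission (simplified, assumed model)."""
    return base_delay_ms + retransmissions * retransmit_timer_ms


def fault_reporters(base_delay_ms, retransmit_timer_ms, retransmissions,
                    max_retransmissions, delay_requirement_ms):
    """Return the entities that would report a fault under the assumed model."""
    reporters = []
    # Subsystem I regards retransmissions within its limit as normal operation.
    if retransmissions > max_retransmissions:
        reporters.append("subsystem I")
    # Service II only observes that its state machine deadline is exceeded.
    delay = access_delay_ms(base_delay_ms, retransmit_timer_ms, retransmissions)
    if delay > delay_requirement_ms:
        reporters.append("service II")
    return reporters


# Hypothetical values: one retransmission with a 500 ms timer, 300 ms requirement.
long_timer = fault_reporters(40, 500, 1, 3, 300)
# Same situation with the timer of subsystem I set to 200 ms.
short_timer = fault_reporters(40, 200, 1, 3, 300)
```

With the hypothetical 500 ms timer, only service II reports a fault, although the timer setting in subsystem I is the cause. With the 200 ms timer, no fault is reported at all.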
Moreover, common communication systems in fact support only a limited set of possible fault events, because it may be difficult to represent all possible problem sources. For example, routers typically report lost packets but do not report packet reordering, since packet reordering as such is not considered an important performance fault in transport networks. Nevertheless, there are services that are sensitive to high levels of packet reordering. This type of fault event, packet reordering in routers, is therefore not detectable by fault management systems that are based purely on fault reporting from the network.
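The difference between packet loss and packet reordering can be illustrated with a minimal sketch. It assumes that every packet carries a unique sequence number and that the number of packets sent is known. It counts a packet as reordered when it arrives after a packet with a higher sequence number, which is one common definition. The function name and the sample data are hypothetical.

```python
def loss_and_reordering(received, sent_count):
    """Count lost and reordered packets.

    received   -- sequence numbers in order of arrival (assumed unique)
    sent_count -- number of packets sent, numbered 0 .. sent_count - 1
    """
    lost = sent_count - len(set(received))
    reordered = 0
    highest_seen = -1
    for seq in received:
        if seq < highest_seen:
            # A packet with a higher sequence number has already arrived.
            reordered += 1
        else:
            highest_seen = seq
    return lost, reordered


# Hypothetical arrival order: every packet arrives, but two arrive late.
lost, reordered = loss_and_reordering([0, 2, 1, 3, 5, 4], sent_count=6)
```

In this sample, a loss counter of the kind routers typically report stays at zero, while two of the six packets are reordered. A fault management system that relies on loss reports alone would therefore see nothing unusual.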
Existing techniques are known that analyze dependencies in different network subsystems on the basis of fault events or alarms. However, these techniques appear to search for dependencies between the fault event(s) and the problem causing the fault events only within each of one or more subsystems. Dependencies between, for instance, an alarm in a certain subsystem and the cause of the problem are thus not considered if the cause resides in a different subsystem.
Some techniques correlate events within a subsystem and are capable of finding non-trivial dependences between network faults that may be hidden from an ordinary Operations, Administration and Maintenance (OAM) system. However, these techniques are not applicable at system level, because they fail to relate the various services to the different subsystems.
US-2003018228-A1 discloses a graph-based dependence mapping technique. The technique assumes fixed dependences, however, and is therefore usually not applicable, since this assumption generally does not hold.
There is thus still a need for a generally applicable solution to the problem of reliably localizing faults within networks comprising different subsystems.