One important purpose of generating alarms in communication networks is to alert the network operator to deviations from normal operating conditions. Alarms are one type of network event. Network alarms can be, for example, so-called “absolute alarms”, which trigger when a system parameter reaches a certain value, or so-called “delta alarms”, which trigger when one or more system parameters, such as the throughput, change by a certain amount per time interval. Network alarms are notifications of possible or actual network problems. They provide important information that network operators use to monitor and resolve network issues.
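The distinction between the two alarm types can be illustrated with a minimal sketch. The function names, the sample format, and the numeric limits below are assumptions chosen for illustration only and are not taken from any particular network management system.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, measured parameter value)


def absolute_alarm(sample: Sample, limit: float) -> bool:
    """Trigger when the parameter reaches the given limit."""
    _, value = sample
    return value >= limit


def delta_alarm(prev: Sample, curr: Sample, max_rate: float) -> bool:
    """Trigger when the parameter changes by at least max_rate per second."""
    (t0, v0), (t1, v1) = prev, curr
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return abs(v1 - v0) / (t1 - t0) >= max_rate


# Hypothetical load samples in percent, taken every 10 seconds.
samples: List[Sample] = [(0.0, 40.0), (10.0, 42.0), (20.0, 95.0), (30.0, 96.0)]

# Absolute alarm: the value itself has reached 90.
absolute_hits = [t for t, v in samples if absolute_alarm((t, v), limit=90.0)]

# Delta alarm: the value changed by at least 5 units per second.
delta_hits = [
    curr[0]
    for prev, curr in zip(samples, samples[1:])
    if delta_alarm(prev, curr, max_rate=5.0)
]
```

In this sketch the absolute alarm fires at both 20 s and 30 s, because the value stays above the limit, whereas the delta alarm fires only at 20 s, when the jump occurs.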
However, since network systems are often large and complex, it is common that a network operator receives, and needs to consider, hundreds or even thousands of alarms from a managed network. According to one example of real network statistics, a network generates on average about 60 000 alarms every day from different levels of network elements. This flood of alarms creates some major problems for network management. For example:
The network must be able to handle the large amount of alarm data being communicated in the uplink/upstream towards a management node.
Investigating and resolving large numbers of different types of alarms separately is difficult and time consuming, which prolongs the impact on network users.
It is often impossible for operators to manually handle the huge number of received alarms in order to resolve network issues.
The current common approach to handling the flood of alarms is to utilize the alarm filtering functions of a so-called Fault Management (FM) system in an Operations Support System (OSS). FM in OSS refers e.g. to the handling of notifications sent by a network element or service when there is an error or fault at the network element or in the communication between the OSS and the network element. The basic purpose of FM in OSS is to receive, process, persist, display and communicate errors/alarms to other systems. The alarm filtering feature allows filters to be applied before alarms are passed on to other consumers inside or outside of the OSS.
Complex filtering and association analysis are two major approaches developed for handling the flood of alarms in current network management systems.
The filtering approach allows a system to automatically filter out specified or identified alarms with an updated alarm representation, and thereby e.g. reduce the number of alarms represented and/or presented to operators. Different filtering techniques can be used depending on alarm types [2] [3]. For example, a frequency-based filter can be used to identify an alarm exceeding a threshold value a certain number of times within a certain period of time.
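A frequency-based filter of this kind can be sketched as follows. This is a minimal illustration and not the method of any specific FM system: it assumes that each alarm record represents one occurrence of the monitored condition, and the function name, the record format, and the alarm type names are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Tuple

Alarm = Tuple[float, str]  # (timestamp in seconds, alarm type)


def frequency_filter(
    alarms: Iterable[Alarm], min_count: int, window: float
) -> List[Alarm]:
    """Return the alarms at which their type has occurred at least
    min_count times within the trailing `window` seconds."""
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    flagged: List[Alarm] = []
    for ts, kind in sorted(alarms):
        times = recent[kind]
        times.append(ts)
        # Drop occurrences that have fallen out of the time window.
        while times and ts - times[0] > window:
            times.popleft()
        if len(times) >= min_count:
            flagged.append((ts, kind))
    return flagged


# Hypothetical alarm log.
alarms = [
    (0.0, "LINK_DOWN"),
    (5.0, "HIGH_TEMP"),
    (20.0, "LINK_DOWN"),
    (40.0, "LINK_DOWN"),
    (300.0, "LINK_DOWN"),
]
flagged = frequency_filter(alarms, min_count=3, window=60.0)
```

Here only the third LINK_DOWN alarm, at 40 s, is identified, since it is the only point at which three occurrences of the same type fall within 60 seconds.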
The association analysis approach allows a system to automatically combine strongly associated alarms in order to reduce the number of alarms. The basic concept is to utilize so-called association rule mining techniques to discover alarm relationships based on statistics of how often alarms occur together [4]. For example, a stronger relationship is assigned to two alarms that always, or at least often, occur together.
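The co-occurrence statistics underlying such an analysis can be sketched with the standard support and confidence measures of association rule mining. The sketch below assumes that the alarm history has already been divided into time windows, each represented as the set of alarm types observed in it; the function name and the alarm type names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Set, Tuple

Rule = Tuple[str, str]  # (antecedent alarm type, consequent alarm type)


def pair_rules(windows: List[Set[str]]) -> Dict[Rule, Tuple[float, float]]:
    """Map each rule a -> b to (support, confidence).

    support    = share of all windows containing both a and b
    confidence = share of the windows containing a that also contain b
    """
    single: Counter = Counter()
    pair: Counter = Counter()
    for window in windows:
        single.update(window)
        pair.update(combinations(sorted(window), 2))
    n = len(windows)
    rules: Dict[Rule, Tuple[float, float]] = {}
    for (a, b), count in pair.items():
        rules[(a, b)] = (count / n, count / single[a])
        rules[(b, a)] = (count / n, count / single[b])
    return rules


# Hypothetical alarm history, one set of alarm types per time window.
windows = [
    {"LINK_DOWN", "PACKET_LOSS"},
    {"LINK_DOWN", "PACKET_LOSS"},
    {"LINK_DOWN", "PACKET_LOSS", "HIGH_TEMP"},
    {"HIGH_TEMP"},
]
rules = pair_rules(windows)
```

In this example LINK_DOWN and PACKET_LOSS always occur together, so the rule between them has a confidence of 1.0, whereas HIGH_TEMP is accompanied by LINK_DOWN in only half of its windows and the corresponding rule has a confidence of 0.5.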
Current FM systems generally offer alarm filtering features that filter out alarms based on alarm attributes. This allows human experts to define alarm filter rules, and thereby the number of alarms presented to operators can be significantly reduced. For example, alarms for some network services, or alarms associated with certain problems, could be filtered out. Further, frequency-based filters could also be added. However, such systems still fail when it comes to allowing the operator to respond quickly and accurately to the alarms that require immediate action. Two main problems are:
1. Manually defined expert rules are required, and they are neither flexible nor adaptive. Manually defined rules depend completely on the knowledge that engineers or experts have of the current network system. This static approach cannot adapt to the dynamics of networks. For example, when a network is reconfigured, new alarms or alarm logics are introduced, and such static rules lack the knowledge required for handling the network changes.
2. Different types of alarms have to be investigated separately, even if they are caused by the same fault. Since a network system is typically complex and interconnected, a single network problem might affect hundreds of network elements or services, resulting in thousands of alarms of different types. Filtered alarms of different types still need to be investigated separately by operators. It is often extremely difficult for operators to understand the root cause and to find the significant alarms for a specific root cause.
Association rule analysis is an adaptive approach which does not require static rules. Associations, such as alarm relations, can be automatically learnt from historic alarm data. Also, when the network is changed, for example when its topological structure changes, the system can learn updated alarm associations from newly collected alarm history. However, one major problem for association analysis is the handling of constant noise alarms.
Some alarm types may occur every few seconds or minutes, thereby generating noise in the alarm data. This may be because the network issues causing these alarms are not significant enough to be fixed, or because systems automatically perform periodic checks which trigger the same alarms each time, etc. However, association analysis based on co-occurrence counts will conclude that these noise-generating alarms have some relationship with all other alarms, since they occur every time other alarms occur. As a consequence, the alarm relations obtained for constantly or frequently occurring alarms are inaccurate.
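The effect of a constantly occurring alarm on co-occurrence statistics can be reproduced with a small sketch. It assumes, for illustration only, that the alarm history is divided into time windows and that the strength of a relation a -> b is measured as the share of windows containing a that also contain b; the alarm type names, including PERIODIC_CHECK for the noise alarm, are hypothetical.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, List, Set, Tuple


def rule_confidence(windows: List[Set[str]]) -> Dict[Tuple[str, str], float]:
    """Confidence of a -> b: share of windows containing a that also contain b."""
    single: Counter = Counter()
    pair: Counter = Counter()
    for window in windows:
        single.update(window)
        pair.update(permutations(window, 2))  # ordered pairs (a, b), a != b
    return {(a, b): count / single[a] for (a, b), count in pair.items()}


# Hypothetical history in which PERIODIC_CHECK is raised in every window.
windows = [
    {"PERIODIC_CHECK", "LINK_DOWN", "PACKET_LOSS"},
    {"PERIODIC_CHECK", "HIGH_TEMP"},
    {"PERIODIC_CHECK", "LINK_DOWN", "PACKET_LOSS"},
    {"PERIODIC_CHECK"},
    {"PERIODIC_CHECK", "FAN_FAILURE", "HIGH_TEMP"},
]
conf = rule_confidence(windows)

# Strength of the relation from every other alarm type to the noise alarm.
noise_scores = {a: c for (a, b), c in conf.items() if b == "PERIODIC_CHECK"}
```

Every other alarm type obtains the maximum score of 1.0 towards PERIODIC_CHECK, which is the same score as the genuine relation between LINK_DOWN and PACKET_LOSS, although the noise alarm carries no information about any of them.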
Consequently, none of the existing approaches is considered sufficient when it comes to the handling and analysis of network alarms.