Networks are becoming increasingly heterogeneous and complex. Future networks will most likely be service-driven, and users will expect constant service availability on any network to which they have access. Such networks will normally consist of a large variety of different access networks and core networks, and they will be required to offer many services simultaneously. In addition, they will exhibit much more dynamic behaviour than current networks in order to adapt, substantially in real time, to end user needs for the best quality of experience (QoE) and to operator needs for optimal resource management at reasonable operating expenditure (OPEX). These factors make network management complicated, and the requirements and expectations placed on what network operators can offer (user-centric, end-to-end, always-best connectivity) become high. In particular, this requires network management systems that are complex, distributed and, to a large extent, adaptive to changes in the network. This, among other things, drives the development towards policy-based network management, which is adapted to deploy expert knowledge in the network regarding services, interaction between services, user preferences and strategic business views, allowing the network to make decisions on how to manage these services in a dynamic, heterogeneous multi-service environment.
Policy-based network management is discussed, for example, in J. Strassner, “Policy-Based Network Management: Solutions for the Next Generation”, Elsevier, 2004.
In any distributed self-managed network, for example one driven by policies, the devices of the network exhibit individual behaviour in order to fulfil service and/or user requirements. This individual behaviour affects the network as a whole. It therefore becomes crucial to be able to observe the behaviour of the network for purposes such as forecasting and the detection of undesired behaviour, malfunctioning, etc. In order to monitor the behaviour of the network, which is composed of network devices, services and users, the management system must monitor events relevant to the network as well as the status of the network. To be useful, the management system should infer both what the network is doing and how it is doing it (events relevant to the network), and how this impacts the status of the network. Ideally, the management system should extrapolate what may happen in the network based on knowledge of what has happened in the network in the past. For this purpose, so-called Key Performance Indicators (KPIs) and Key Quality Indicators (KQIs) are used, which describe how network operators evaluate the efficiency and effectiveness of their exploitation of existing network resources. These indicators can be based on a single performance parameter, such as the number of missed calls on a network device or in a network. They can also be based on complex equations involving multiple network parameters.
Network device KPIs are calculated for individual network devices and indicate the performance of the respective devices. Network-level KPIs are computed according to specified formulae from aggregations of network device KPIs or from other measurements, and indicate the overall performance of the network as a whole.
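The relationship between device-level and network-level KPIs can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the counter schema (`DeviceCounters`), the choice of a call success ratio as the device KPI, and the traffic-weighted mean used as the aggregation formula. An operator's actual formulae may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceCounters:
    """Raw counters reported by one network device (hypothetical schema)."""
    device_id: str
    attempted_calls: int
    missed_calls: int


def device_kpi(c: DeviceCounters) -> float:
    """Device-level KPI: fraction of attempted calls that were not missed."""
    if c.attempted_calls == 0:
        return 1.0  # assumption: a device with no traffic is treated as healthy
    return 1.0 - c.missed_calls / c.attempted_calls


def network_kpi(counters: list[DeviceCounters]) -> float:
    """Network-level KPI: device KPIs averaged, weighted by attempted calls."""
    total = sum(c.attempted_calls for c in counters)
    if total == 0:
        return 1.0  # same assumption as for an idle device
    return sum(device_kpi(c) * c.attempted_calls for c in counters) / total


# Hypothetical counters for two devices.
devices = [
    DeviceCounters("node-a", attempted_calls=1000, missed_calls=20),
    DeviceCounters("node-b", attempted_calls=200, missed_calls=30),
]
per_device = {c.device_id: device_kpi(c) for c in devices}
overall = network_kpi(devices)
```

Weighting by attempted calls keeps a lightly loaded device from dominating the network-level figure; an unweighted mean would be an equally valid formula if the operator specified one.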
Traditionally, KPIs have been calculated at regular intervals on the basis of historically recorded data. These historical KPIs have been used to determine how well the network performed its functions in the past and to identify retrospectively any problems that may have occurred. Operators set threshold levels for adequate KPI performance. A KPI is violated when it drops below the specified threshold, indicating poor performance on the device or in the network.
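A retrospective threshold check of this kind might look as follows. This is only a sketch: the function name `find_violations`, the daily granularity, the KPI values and the threshold of 0.95 are all hypothetical.

```python
from datetime import date


def find_violations(history: dict[date, float], threshold: float) -> list[date]:
    """Return, in chronological order, the days on which the KPI fell below the threshold."""
    return sorted(day for day, value in history.items() if value < threshold)


# Hypothetical daily KPI values recorded in the past.
history = {
    date(2024, 1, 1): 0.97,
    date(2024, 1, 2): 0.91,
    date(2024, 1, 3): 0.96,
    date(2024, 1, 4): 0.89,
}
violations = find_violations(history, threshold=0.95)
```

With this data, `violations` holds the second and fourth days. The check only reports that a violation happened, not why it happened.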
Retrospective reports make it possible to identify dips in a graph that indicate KPI violations. An operator can then establish that the network has been under-performing on, for example, particular days. To identify what caused the performance degradation on those days, the operator has to revert to, for example, call data records, alarm logs and trouble ticket logs. Such historical KPI calculation and presentation has been widely used in various network management solutions, and many tools perform this function with varying degrees of elegance, efficiency and automation. Recently, functionality for monitoring KPIs in real time has come into high demand among network operators. Some tools exist today that provide real-time KPI monitoring, allowing users to define new KPI formulae and KPI violation thresholds, to set up alarms and to present the current status of KPIs.
However, with the existing tools it is only possible to react to performance degradations that have already occurred. Such retrospective approaches are clearly unsatisfactory, since they leave operators no possibility to respond to KPI violations in a timely manner. The only possible reactions at such a late stage consist in trying to identify the root cause of a violation in order to prevent problems from recurring, which is a very onerous task that relies entirely on experts, and, for managed service operations for example, in paying penalty fees for past violations.