In a telecommunication network, it is necessary to measure, monitor and act on key parameter indicators (KPIs) of the Network Elements (NEs) and of the links that connect them, in order to maintain the network in an operational state.
Specialized units known as Network Operation Centers (NOCs) are configured specifically to monitor KPIs and to take appropriate action when the thresholds of those KPIs are breached.
Generally, KPIs are defined in the maintenance manual of each Network Element and vary from one NE to another according to its function. The KPIs are exposed to NOCs through a Management Information Base (MIB) as Object Identifiers (OIDs). NOCs use applications that support the standard Simple Network Management Protocol (SNMP) to query the KPIs periodically and to monitor whether any KPI exceeds its threshold, a condition that raises an alarm and can eventually make the system unstable.
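The threshold monitoring described above can be illustrated with a minimal Python sketch. It is not taken from any particular NOC product: the `KpiThreshold` type, the `check_kpis` function, the OID strings (under a made-up enterprise number) and the limits are all hypothetical, and the SNMP query is replaced by a caller-supplied `read_value` function so that the example runs without a network.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class KpiThreshold:
    oid: str      # OID identifying the KPI (placeholder values, not real MIB objects)
    name: str     # human-readable KPI name (hypothetical)
    limit: float  # an alarm is raised when the sampled value exceeds this limit


def check_kpis(thresholds: List[KpiThreshold],
               read_value: Callable[[str], float]) -> List[str]:
    """Run one polling cycle: read each KPI and report threshold breaches.

    read_value stands in for an SNMP GET issued by the NOC application.
    """
    alarms = []
    for t in thresholds:
        value = read_value(t.oid)
        if value > t.limit:
            alarms.append(f"{t.name} ({t.oid}) = {value} exceeds {t.limit}")
    return alarms


# Canned readings used in place of a live SNMP agent (illustrative values only).
sample = {
    "1.3.6.1.4.1.99999.1.1": 91.0,
    "1.3.6.1.4.1.99999.1.2": 40.0,
}
thresholds = [
    KpiThreshold("1.3.6.1.4.1.99999.1.1", "cpu_load_percent", 85.0),
    KpiThreshold("1.3.6.1.4.1.99999.1.2", "link_utilization_percent", 80.0),
]

alarms = check_kpis(thresholds, sample.__getitem__)
for alarm in alarms:
    print(alarm)
```

In a real deployment this check would be repeated at every polling interval, with `read_value` backed by an SNMP library. Note that a check of this kind only reports a breach once it has already happened.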
In view of the above, it would be desirable for next-generation NOC solutions to predict an imminent alarm condition, so that the required resources are in place to act on it immediately, thereby reducing the lead time to fix it. A delayed fix could otherwise cause the network operator a huge revenue loss.
Most existing solutions primarily address the problem of determining alarm conditions in telecom networks, rather than predicting them. The anomaly/outlier detection methods used for this purpose include distance-based techniques (e.g., k-NN), cluster analysis, and classification techniques (e.g., SVM). Choosing the right method is itself a challenge, because the choice depends on the nature of the KPI. Moreover, existing approaches build their predictive systems from history logs and are offline learning methods.
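As an illustration of the distance-based category mentioned above, the following Python sketch scores each sample of a single KPI by its distance to its k-th nearest neighbour, so that isolated values receive high scores. It is a generic textbook variant, not the method of any specific existing solution; the function name `knn_outlier_scores`, the choice of k and the sample values are assumptions made for the example.

```python
from typing import List


def knn_outlier_scores(samples: List[float], k: int = 3) -> List[float]:
    """Score each sample by its distance to its k-th nearest neighbour.

    A large score means the sample lies far from the rest of the history.
    """
    if not 1 <= k < len(samples):
        raise ValueError("k must be at least 1 and smaller than the number of samples")
    scores = []
    for i, x in enumerate(samples):
        # Distances from sample i to every other sample, in ascending order.
        dists = sorted(abs(x - y) for j, y in enumerate(samples) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical logged KPI history with one abnormal reading.
history = [50.0, 52.0, 51.0, 49.0, 50.5, 95.0, 51.5]
scores = knn_outlier_scores(history, k=3)

# Index of the sample with the highest outlier score.
outlier_index = max(range(len(history)), key=scores.__getitem__)
print(outlier_index, scores[outlier_index])
```

The sketch operates on a complete, already-logged history, which mirrors the offline nature of the existing approaches: it flags an abnormal value after the fact and does not by itself forecast a future breach.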
U.S. Pat. No. 6,353,902 discloses a system for proactive maintenance of a telecommunications network. A database is created containing characteristics of a plurality of valid logs. These valid logs represent alarms within a network which report status and abnormalities in the network and which have been specifically selected by a network domain expert or administrator from a larger group of logs. The characteristics correspond to a pattern of network fault parameters. The network is monitored for occurrences of a valid log. When a valid log is encountered, future occurrence of a fault is predicted based on an analysis of the valid log and the characteristics found in the database. Corrective measures are taken to prevent the fault from occurring.