Described below is a method of operating network entities in a communications system, in particular in a communications system having a management network with at least one hierarchical level for a management of the communications system. Additionally, described below is a computer program implementing the above mentioned method. Further, described below are the network entities in a communications system that exchange fault management and/or performance management related data.
According to the principles of a network management system, a typical management of a communications system, for example, of a telecommunications system, has several hierarchical levels for the management of the communications system. The term hierarchical level of a management network means that every level in the management network of the communications system has a certain management and/or communications system related functionality specific to this level, and that, depending on its hierarchical position in the network, it performs a certain management function. Each of these hierarchical levels, with the exception of the top level and the first-line level, has a double management function: a manager function and an agent function. Each hierarchical level, with the exception of the first-line level, has a manager function with regard to the underlying level, and every hierarchical level, with the exception of the top level, has an agent function with regard to the level above. Thus, management of a communications system features a hierarchical structure clearly defining the functions at every hierarchical level of this communications system or of the management network of the communications system respectively. See the example provided in FIG. 1, which represents three levels of a hierarchical management structure in a telecommunications system and is described below.
Each level has corresponding entities or elements of a physical and/or abstract nature. Thus, an entity of a hierarchical level can be software and/or hardware (a device) in a communications system. In the following, such entities or elements will be referred to as “network entities”. Depending on whether the level of the management network performs the functionality of a manager, of an agent, or of both, these network entities are managers, agents, or both. In the following, the terms manager and agent will be used according to the functionality of the corresponding hierarchical level and, thus, of the corresponding network entity of this hierarchical level. For this reason, if a level represents both a manager level and an agent level, a network entity in this hierarchical level will be a manager or an agent depending on the function to be performed at a given moment by this network entity.
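The role rule described above can be illustrated with a minimal sketch. The function name `roles_for_level` and the convention of counting levels from 0 (top level) downwards are assumptions made only for this illustration.

```python
def roles_for_level(index: int, level_count: int) -> set:
    """Return the management roles of the hierarchical level at `index`.

    Assumption of this sketch: index 0 is the top level and
    index level_count - 1 is the first-line level.
    """
    if not 0 <= index < level_count:
        raise ValueError("level index out of range")
    roles = set()
    if index < level_count - 1:   # every level except the first-line level
        roles.add("manager")      # manages the underlying level
    if index > 0:                 # every level except the top level
        roles.add("agent")        # is managed by the level above
    return roles


# Three levels, as in the example of FIG. 1.
for i in range(3):
    print(i, sorted(roles_for_level(i, 3)))
```

With three levels, only the middle level carries both roles, which corresponds to the double management function described above.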
Network management as such refers to the Operation, Administration, and Maintenance (OAM) of communications systems or networks like telecommunications networks at the top level. Network management is the execution of a variety of functions required for controlling, planning, allocating, deploying, coordinating, and/or monitoring the resources of a network, including performing functions such as initial network planning, frequency allocation, predetermined traffic routing to support load balancing, cryptographic key distribution authorization, configuration management, fault management, security management, performance management, bandwidth management, and/or accounting management. Further, in such a management system hardware and/or software are provided that support OAM functionality and provide these functions, for example, to network users and/or administrators. Thus, OAM includes facilities for operating, managing and maintaining networks.
Managers in a communications system are configured to start operations for the operation, administration and maintenance of the communications network including configuration, fault and/or performance management (CM, FM, and/or PM) of the communications system, for example, as mentioned above. This is done by sending requests, which are performed by the agents, in particular, by the agents assigned to the corresponding managers. The managers then receive corresponding feedback, called responses, from the agents.
Network entities implementing the functionality of an agent in the communications network recognize events relevant for the operation, administration and maintenance of the communications network (e.g. alarms), generate corresponding notifications, and transmit these notifications, usually as event reports, to the managers, in particular, to the managers the network entities are assigned to. Thus, an efficient network management is enabled.
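The two directions of communication described above, namely requests with responses on the one hand and notifications sent as event reports on the other hand, can be sketched as follows. The class and method names, as well as the operation name `get-configuration`, are hypothetical and serve only this illustration; no specific OAM protocol is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Manager:
    name: str
    received_reports: list = field(default_factory=list)

    def request(self, agent: "Agent", operation: str) -> str:
        # The manager starts an operation by sending a request;
        # the agent's return value plays the part of the response.
        return agent.handle_request(operation)

    def on_event_report(self, agent_name: str, event: str) -> None:
        # Event reports arrive unsolicited from assigned agents.
        self.received_reports.append((agent_name, event))


@dataclass
class Agent:
    name: str
    manager: Manager  # the one manager this agent is assigned to

    def handle_request(self, operation: str) -> str:
        return f"{self.name}: {operation} done"

    def notify(self, event: str) -> None:
        # An OAM relevant event (e.g. an alarm) is reported upwards.
        self.manager.on_event_report(self.name, event)


ems = Manager("EMS")
ne = Agent("NE", manager=ems)
response = ems.request(ne, "get-configuration")
ne.notify("alarm")
```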
The provision of OAM functionality like CM, FM and/or PM, for example, is assured by communication between the hierarchical levels of the management network of the communications system, wherein the network entities of an upper level manage the network entities of the underlying level to ensure a correct performance of the OAM functionality and the managed network entities act depending on the management of the upper management level. Further, in the management network of the communications system a strict assignment exists between managers and agents. A manager has a certain set of agents it has to manage. Agents, in turn, are assigned to one manager. Thus, the performance and safeguarding of the OAM functionality is done in a strict hierarchical way between the levels of the management network of the communications system.
Configuration Management (CM) serves the purpose of making a whole networked and distributed system available, while FM and PM keep the system operational or restore an operational state. The most important CM tasks are inventorying, or checking and noting, configurations and/or the distribution of (hardware and/or software) entities, elements, and/or components of a communications system; and appropriate management to ascertain the changes applied to the distribution of (hardware and/or software) entities, elements, and/or components of the communications system, and where appropriate to implement a corresponding reconfiguration. Additionally, CM is also responsible for installation of documentation and directory services.
Fault Management (FM) includes functions for detecting, isolating, and correcting malfunctions in a (tele-) communications network. FM and its functions compensate for environmental changes, and include maintaining and examining error logs, accepting and acting on error detection notifications, tracing and identifying faults, carrying out sequences of diagnostics tests, correcting faults, reporting error conditions, and localizing and tracing faults by examining and manipulating database information. Thus, when a fault or another FM related event occurs, that is, any event causing initiation or implementation of at least one FM related function, a network component will often send a notification to the network operator using a protocol such as SNMP, for example. An alarm is a persistent indication of a fault that clears only when the triggering condition has been resolved.
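The persistence of an alarm can be illustrated with a small state sketch. The class name `AlarmState` and the choice to emit a notification only when the state changes are assumptions of this illustration, not requirements stated above.

```python
class AlarmState:
    """Tracks whether an alarm is active for one monitored condition."""

    def __init__(self) -> None:
        self.active = False
        self.notifications = []  # notifications that would be sent out

    def update(self, fault_present: bool) -> None:
        if fault_present and not self.active:
            self.active = True
            self.notifications.append("raised")
        elif not fault_present and self.active:
            self.active = False
            self.notifications.append("cleared")
        # Otherwise nothing changes: the alarm persists (or stays clear)
        # without a new notification.


alarm = AlarmState()
# The triggering condition is absent, then present for three samples,
# then resolved again.
for sample in [False, True, True, True, False]:
    alarm.update(sample)
print(alarm.notifications)
```

In this sketch the alarm stays active for as long as the triggering condition is present and is cleared only once the condition is no longer observed.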
Performance Management (PM), in turn, records the system load and displays performance bottlenecks and has a direct influence on network deployment, network extensions and error management. Parameters such as the response time, round trip time, and delay time are important for PM, as are the theoretical performance limits and network load. These parameters are influenced by a number of transmission characteristics such as flow control, access method, attenuation or packet loss rates. PM allows operators to monitor network load and detect performance trends for future network planning. Thus, when a performance bottleneck or another PM related event occurs in the communications system, at least one PM related function is then performed.
The communication between the hierarchical levels of a management network of a communications system and thus between the managers and the agents is usually facilitated by management interfaces, called OAM interfaces. These interfaces can be implemented, for example, by applying protocols like Simple Network Management Protocol (SNMP), Transaction Language 1 (TL1), Extensible Markup Language (XML), or Common Object Request Broker Architecture (CORBA).
When managing network entities of the first-line level or one of the upper levels, due to the hierarchical character of the management structure, this management is performed by at least one of the upper levels via OAM systems and/or corresponding OAM interfaces respectively. Thus, to maintain consistency of the communications network, typical OAM functions like CM, FM and/or PM, for example, concerning the network entities of the first-line level or one of the upper levels are performed by the network entities of the upper levels, wherein the corresponding managers have to perform well coordinated post processing and correlation of information, and the responsible managers themselves have to be well coordinated. To ensure a stable and errorless performance of OAM functions like FM and/or PM, for example, human network operators often have to be involved in the management processes. In such cases, a human network operator has to be able to manage and oversee a variety of data concerning a variety of network entities. The known implementation of OAM functions like FM and/or PM, for example, therefore requires frequent intervention and regular monitoring and control by human operators, which is a very complex task. Thus, the known implementation of OAM functions like FM and/or PM has the disadvantage of a low degree of automation.
Further, an OAM system with the known implementation of OAM functions like FM and/or PM, for example, has large processing requirements, as a large number of alarms and a large amount of further OAM functionality relevant or related data (like FM and/or PM related data) have to be exchanged. This heavy data traffic, in turn, requires high network bandwidth.
As already outlined above, a known OAM system responsible for FM and/or PM of the communications system, for example, is structured into several hierarchical levels of a management network of a communications system, for example, of a telecommunications system. FIG. 1 represents three hierarchical levels of such a management network of a telecommunications system.
In the following, FM and/or PM being important and typical OAM functions will be regarded in more detail.
As already outlined above, the FM and/or PM is performed by providing FM and/or PM related data from the lower levels to the upper levels, where FM and/or PM relevant or related decisions are made, and results of these decisions are then transmitted from the upper levels back to the lower levels.
At the first line level 152, the management network of a telecommunications system has network elements (NEs) 121, 122, 123, and 124. In the following, this hierarchical level 152 will be referred to as the “NE level”. A network element (NE) 121, 122, 123, 124 is a kind of telecommunications (hardware) equipment or element that is addressable and manageable. A network element (NE) can also be seen as a combination of hardware and software or a network entity formed of software that primarily performs telecommunications service functions or predefined and a priori agreed upon functions and, thus, provides support or services to users, for example. NEs 121, 122, 123, 124 are interconnected and managed through at least one Element Manager System (EMS) 111, 112 in the upper management level 151, which will be referred to as the “EMS level” in the following. The NE level 152 performs the agent functionality, and the EMS level 151, in turn, performs a manager functionality with regard to the NE level 152 and an agent functionality with regard to the upper level 150 in the hierarchy of the management network.
An EMS 111, 112 is a manager of one or more NEs 121, 122, 123, 124 of a specific type and allows all the features of each NE 121, 122, 123, 124 to be managed individually. Each of the NEs 121, 122, 123, 124 is connected to one responsible and managing EMS 111, 112 via appropriate links. The communication between the NE level 152 and the EMS level 151 and thus between the NEs 121, 122, 123, 124 and the EMS 111, 112 is ensured by management interfaces 141, 142, 143, 144, like EMS/NE Operation and Maintenance (OAM) interfaces, implemented on the links between the NE and EMS level 152, 151. Such connections between the EMS and NEs are also called “southbound” connections.
EMS 111, 112, in turn, are managed by an Operations Support System (OSS) 100 of the top level 150, in the following referred to as the “OSS level”. The OSS 100 monitors the underlying management layers 151, 152 and predominantly looks at functional and nonfunctional requirements of the communications system and of the underlying layers 131, 132. The OSS level 150 performs just a manager function with regard to the underlying EMS level 151. The communication between the OSS level 150 and the EMS level 151 or the OSS 100 and the EMS 111, 112 respectively is enabled by links between the two levels, wherein management interfaces 131, 132, for example EMS/OAM interfaces, are implemented on these links for this purpose. The connections or links between the OSS level 150 and EMS level 151 are also known as “northbound” connections.
The NE level 152 or the NEs 121, 122, 123, 124 there, and the OSS level 150 or the OSS 100 there, permanently monitor the system performance of a live network. When problems occur, countermeasures have to be taken in order to maintain the quality of service (QoS) at acceptable levels. In conventionally operating systems, this process involves transferring data across numerous (vertical) interfaces between hierarchical systems. In addition, the process is not automated from an operator's point of view. The operator either has to initiate corrective actions manually or has to provide a system to assist with this task.
Procedural examples for FM and/or PM as performed in a known system are shown in FIGS. 2 and 3. In FIG. 2, two management levels, the NE level 252 and the EMS level 251, are shown. The network entities 221, 222, 223, 224, and 225 represent the NEs in the NE level 252, and the network entity 211 represents an EMS in the EMS level 251. In FIG. 2a, an FM and/or PM related event like a fault (represented by a lightning bolt) occurs at the NE 222. This event may cause initiation or implementation of at least one FM and/or PM related function in the corresponding communications system or at the NE 222 respectively. As a consequence, the NE 222 sends FM and/or PM related data, here an alarm, to its EMS 211, wherein the sending of the alarm is shown with a bold arrow leading to the EMS 211. NEs 223 and 224 have dependencies on NE 222 or relevant configuration management relationships with NE 222 respectively (shown as dashed lines between NE 222 and the NEs 223, 224).
Because of the dependencies or relevant configuration management relationships respectively, as shown in FIG. 2b, the NEs 223 and 224 detect a fault (represented by a lightning bolt at NEs 223, 224) at some later point in time. After the detection of the fault, they can then react to this fault (thus, initiate or implement FM and/or PM related functions) and also send FM and/or PM related data, here alarms, similar to the alarm of NE 222, to the EMS 211, which manages NEs 223 and 224 in addition to NE 222. Here too, the sending of the alarm is shown with a bold arrow leading to the EMS 211.
Further, it has to be noted that the NEs 221 and 225 have no relevant configuration management relationships (or dependencies) with the NE 222. For this reason, they are not involved in the given configuration process.
If all concerned NEs are managed by the same EMS, as is the case in FIG. 2a, it would be possible to initiate corrective actions in NE 223 and NE 224 upon reception of the alarm from NE 222. However, this is not possible if the concerned NEs are managed by different EMS. This situation is shown in FIG. 3.
In FIG. 3, the management levels of the NE level 352, the EMS level 351, and the OSS level 350 are shown. The network entities 321, 322, 323, 324, and 325 represent the NEs in the NE level 352, the network entities 311 and 312 represent EMS in the EMS level 351, and the network entity 300 is the OSS of the OSS level 350. There, NEs 323 and 324 having CM relationship with (or dependencies on) the NE 322 (shown as dashed lines between NE 322 and the NEs 323, 324) are managed by different EMS 311 and 312.
FIG. 3a shows the situation when an FM and/or PM related event like a fault (represented by a lightning bolt) occurs at the NE 322. As in the situation in FIG. 2a, the NE 322 consequently sends an alarm (FM and/or PM related data) to its managing EMS 311. Here too, the sending of the alarm is shown with a bold arrow leading to the EMS 311.
Because of the dependencies or relevant configuration management relationships respectively, the NEs 323 and 324 detect an FM and/or PM related event, namely a fault (represented by a lightning bolt at NEs 323, 324), at some later point in time, as shown in FIG. 3b. After the detection of the fault, they can then react to this fault (initiate or implement FM and/or PM related functions) and send alarms, similar to the alarm of NE 322, to their managing EMS 311 and 312. Here too, the sending of an alarm is shown with a bold arrow leading to the EMS 311 or 312.
For an efficient and fast reaction to FM and/or PM related events concerning several NEs and, thus, for efficient and fast PM and/or FM, wherein a variety of PM and/or FM related functions have to be initiated and/or implemented in the concerned NEs, it is desirable to start corrective actions in NE 323 and NE 324 upon reception of the alarm from NE 322 in a more effective way. However, this is not possible, as the concerned NEs are managed by different EMS. In the present case, the EMS 311 and 312 have to provide the fault or configuration related data to the managing OSS 300. This situation is shown in FIG. 3.
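The difference between the situations of FIG. 2 and FIG. 3 can be expressed as the question of which entity is the lowest manager common to all concerned NEs. The following sketch is illustrative only. The text states that NE 322 is managed by EMS 311 and that NEs 323 and 324 are managed by different EMS; which of the two is attached to which EMS, and the entries for NEs 321 and 325, are assumptions of this sketch, as are the function names.

```python
# Manager of each entity in FIG. 3 (partly assumed, see the note above).
MANAGED_BY = {
    "NE 321": "EMS 311",
    "NE 322": "EMS 311",
    "NE 323": "EMS 311",
    "NE 324": "EMS 312",
    "NE 325": "EMS 312",
    "EMS 311": "OSS 300",
    "EMS 312": "OSS 300",
}


def managers_of(entity):
    """Return the managers above `entity`, from nearest to top level."""
    result = []
    while entity in MANAGED_BY:
        entity = MANAGED_BY[entity]
        result.append(entity)
    return result


def lowest_common_manager(entities):
    """Return the lowest manager that manages every given entity."""
    chains = [managers_of(e) for e in entities]
    for candidate in chains[0]:
        if all(candidate in chain for chain in chains):
            return candidate
    return None


print(lowest_common_manager(["NE 322", "NE 323"]))
print(lowest_common_manager(["NE 322", "NE 323", "NE 324"]))
```

Under the assumed assignment, NE 322 and NE 323 share EMS 311, whereas all three concerned NEs have only the OSS 300 in common, which is why the data has to be passed up to the OSS level.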
Thus, in a known (tele-) communications system, as presented by FIG. 1, the stable and errorless operation of network entities of a communications system, which is important for well functioning FM and/or PM, is assured at the EMS level 151 and/or at the OSS level 150. At the EMS level 151, such operation can be assured only for the NEs managed by the managing EMS, typically the NEs of a single vendor. In FIG. 1, such operation of NEs 121, 122 is assured by the EMS 111, and the stable and errorless operation of NEs 123, 124 is assured by the EMS 112. At the OSS level 150, in turn, the FM and/or PM related operation can be assured between NEs attached to different EMS. In FIG. 1, this FM and/or PM related operation of NEs 121, 122, 123, and 124 attached to the EMS 111 and 112 is assured by the OSS 100.
Thus, when considering the known implementation of FM and/or PM, the FM and/or PM related data like alarms is always sent by an NE to the managing EMS. The managing EMS, in turn, forwards this data (e.g. alarms) to the corresponding managing OSS, where the alarms (FM and/or PM related data) are post processed and also correlated with further FM and/or PM related data in order to find the root cause. After doing so, the system or operator may react to the network problems. In most cases this is a reconfiguration of the network in order to restore/assure a certain QoS.
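The correlation step at the OSS can be illustrated with a deliberately simple heuristic: an alarmed NE is treated as a root cause candidate if it does not depend on any other alarmed NE. This heuristic and the function name are assumptions made for illustration; the known systems are not stated to use this particular rule. The dependencies are those of FIG. 3.

```python
# Dependencies from FIG. 3: NEs 323 and 324 depend on NE 322.
DEPENDS_ON = {
    "NE 323": {"NE 322"},
    "NE 324": {"NE 322"},
}


def root_cause_candidates(alarmed):
    """Return the alarmed NEs that do not depend on another alarmed NE."""
    alarmed = set(alarmed)
    return {
        ne for ne in alarmed
        if not (DEPENDS_ON.get(ne, set()) & alarmed)
    }


# Alarms collected at the OSS after being forwarded by the EMS.
print(root_cause_candidates(["NE 322", "NE 323", "NE 324"]))
```

Such a rule can only be applied where the alarms of all concerned NEs and their dependencies are available together, which in the known systems is the OSS level.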
Operating in this way, the known systems have the following disadvantages. All FM and/or PM related data has to be transferred to a higher level (network management (NM), OSS). This requires high bandwidth and processing power. Further, such a management implementation is slow and does not allow for a fast reaction to network problems, which leads to the known implications like reduced QoS, decreased customer satisfaction, etc. Furthermore, such a management process is typically not automated. The operator has to analyze the data and also take corrective actions personally. This task may of course be assisted by some applications, but these applications then have to be provided by the operator or a systems integrator. These parties typically do not have the in-depth knowledge of the different NEs required to fully exploit the FM and/or PM correlation. Thus, the process is error-prone because some correlation is lost when data is being passed upwards from the NE to EMS to NM and then to the OSS. Additionally, such an implementation does not effectively use all resources of a communications system. Thus, for example, available direct interfaces between NEs (e.g., in third-generation (3G) Long Term Evolution (LTE)) and/or between EMS are not exploited.
One way would be to exploit the direct interfaces between the network entities of an agent level, like NEs or EMS, for example, by sending the corresponding FM and/or PM related data between the NEs or EMS directly via these interfaces. Here, it has to be noted that between NEs and EMS direct communications links exist, which are used to carry only traffic related to call processing. In such a case, communications links responsible for call processing could be used. Thus, upon reception of an alarm, the NEs could take actions in order to minimize the performance impact of the alarmed NE on other NEs. However, a problem is that for call processing purposes an NE (or EMS) is connected to numerous other NEs (or EMS). Thus, an NE (or EMS) would send alarms to all NEs (or EMS) to which a physical or logical communications link exists. Thus, this way would again lead to the above disadvantages of high bandwidth and processing power requirements, a slow reconfiguration process, etc.
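The drawback of sending alarms over every call processing link can be made concrete with a small count. The sketch uses the NE numerals and dependencies of FIG. 2. The set of call processing links of NE 222 is an assumption, since the text only states that an NE is connected to numerous other NEs; the function name is likewise hypothetical.

```python
# Dependencies on NE 222 as shown in FIG. 2 (dashed lines).
DEPENDENT_ON_222 = {"NE 223", "NE 224"}

# Call processing links of NE 222 (assumed for this sketch).
LINKED_TO_222 = {"NE 221", "NE 223", "NE 224", "NE 225"}


def flood(linked):
    """Return the recipients when the alarm is sent over every link."""
    return sorted(linked)


sent = flood(LINKED_TO_222)
# Recipients that have no dependency on the alarmed NE.
unneeded = [ne for ne in sent if ne not in DEPENDENT_ON_222]
print(len(sent), unneeded)
```

In this assumed topology, half of the alarm messages reach NEs that have no relationship with the alarmed NE, and the share grows with the number of call processing links.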