The capacity of long-haul communication systems, such as “undersea” or “submarine” systems, has been increasing at a substantial rate. For example, some long-haul optically amplified undersea communication systems are capable of transferring information at speeds of 10 gigabits per second (Gbps) or greater on a single optical channel. To maximize the transmission capacity of an optical communication system, a single optical fiber may carry multiple optical channels (e.g., 64 or more) in a process known as wavelength division multiplexing (WDM). Because such a high-capacity communication system is particularly subject to risk at various points on the network, network management and remote diagnosis have been used by system owners and operators to meet Service Level Agreements (SLAs).
A simplified communication network 10 is shown in FIG. 1. The communication network 10 comprises interconnected equipment referred to as network elements (NE) 12. In an optical communication network, for example, network elements can include transceivers, amplifiers, combiners, splitters, and telemetry equipment. As the number of transmission channels in a fiber and the number of fibers in a cable increase to accommodate the increased capacity of the optical network, the amount of equipment or network elements 12 also increases. Multiple network elements 12 can be housed together at a processing location or node 14, which sometimes is referred to as a cable station in a communication network. Field personnel can be located at the node or cable station to maintain the equipment.
Network management or traffic control activities are coordinated at a Network Management Center (NMC) or centers 16 connected to the network nodes 14. A Network Management System (NMS) 18 can be located at the NMC 16 to provide data used for proactive maintenance and network capacity planning. One type of NMS 18 provides a comprehensive, graphically integrated view of the network topology for use in monitoring and troubleshooting activities.
The NMS 18 may be responsible for providing fault management by manipulating and storing fault indicators such as network element Quality of Service (QoS) alarms that indicate the violation of SLAs. In addition, the NMS 18 may be used to provide other network management functions such as configuration management, performance management, security management, and accounting management. At the high-level NMC 16, operators using the NMS 18 may access and/or manage network components (e.g., the individual nodes and/or network elements). At some nodes 14, field personnel can be given access to the NMS screens pertaining to equipment under their control or to remotely managed nodes.
Using the NMS 18, network operators may diagnose and maintain communication networks using a centralized approach. The NMS 18 maintains a centralized decision process using a centralized server, and an operator at the NMC 16 essentially coordinates management across the whole network. Correlation rules and topological configuration information are centrally located for the entire network, and centralized alarm correlation and root cause analysis are performed. This centralized approach to fault diagnosis often excludes expert knowledge distributed throughout the NMC area of control and does not adequately adapt to changes in network topology. In a global network where nodes may be widely distributed geographically, command and control issues may also arise.
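By way of illustration only, the centralized correlation described above may be sketched as follows. The sketch is a minimal example under stated assumptions and is not taken from any particular NMS: the `Alarm` fields, the `UPSTREAM` table, the element names, and the single correlation rule (an alarm is treated as a root cause when the element directly upstream of it is not itself alarmed) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    node: str      # node (cable station) that reported the alarm
    element: str   # network element that raised the alarm
    kind: str      # hypothetical alarm type, e.g. "LOSS_OF_SIGNAL"


# Hypothetical topology held only at the central server:
# each network element maps to the element directly upstream of it.
UPSTREAM = {
    "amp-2": "amp-1",
    "rx-1": "amp-2",
}


def root_causes(alarms):
    """Return the alarms whose upstream element is not itself alarmed.

    Assumed rule: an alarm on an element is a downstream symptom when
    the element directly upstream of it has also raised an alarm.
    """
    alarmed = {a.element for a in alarms}
    return [a for a in alarms if UPSTREAM.get(a.element) not in alarmed]


if __name__ == "__main__":
    reported = [
        Alarm("station-A", "amp-1", "LOSS_OF_SIGNAL"),
        Alarm("station-A", "amp-2", "LOSS_OF_SIGNAL"),
        Alarm("station-B", "rx-1", "LOSS_OF_SIGNAL"),
    ]
    for alarm in root_causes(reported):
        print(alarm.node, alarm.element, alarm.kind)
```

In this sketch both the rule and the topology table reside in a single process, so any change in network topology requires an update to the central `UPSTREAM` table, and knowledge held at an individual node cannot contribute to the correlation.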
Accordingly, there is a need for a system and method for fault diagnosis that shares any new diagnostic knowledge between the nodes and distributes the alarm correlation to local points or nodes in the network. There is also a need for a system and method for fault diagnosis that provides hierarchical processing at both the node level and at a higher level.