Today, optical networks have established themselves as powerful high-bandwidth communication networks. However, their components are still exposed to a variety of service disruptions, including hard and soft failures with long or very short time intervals. Despite the fact that soft and intermittent failures are the most frequent ones in this type of network, hard failures with long disconnection times are the most destructive. In terms of hard failures, cable-cuts are more prevalent than switch-crashes since optical cables are extended widely and consequently have a higher likelihood of exposure to environmental damage.
In transporting a high volume of data, any severed Wavelength Division Multiplexed (WDM) link can lead to the loss of several terabits per second. Consequently, the network performance can be harshly degraded and the robustness of the network can be shattered. Therefore, maintaining a certain level of network performance, or at least minimizing the effects of link failures, is an important issue that needs to be addressed in the field.
In general, two types of dynamic restoration approaches are employed to recover affected networks: link restoration and path restoration (cf. S. Ramamurthy, L. Sahasrabuddhe, and B. Mukherjee, “Survivable WDM Mesh Networks,” IEEE Journal of Lightwave Technology, vol. 21, no. 4, pp. 870-883, April 2003). Debates over the pros and cons of these two techniques continue as new approaches improve on previous ones. However, the following comparisons highlight some of the drawbacks of methods and systems for fault localization and restoration known in the prior art.
It is commonly understood in the prior art that link restoration is a point-to-point technique, while path restoration is an end-to-end restoration technique. Link restoration generally restores the affected path using fewer links than path restoration. Consequently, link restoration decreases the number of to-be-reconfigured switches and increases the probability of success. In addition, as link restoration maintains a number of working links (segments of the path) in place, it preserves the network load balancing, thus avoiding unnecessary “chaos” in the system. However, link restoration can create a congested area around the failed link or cause long restoration loops. While the latter phenomenon could severely alter the protocol functionality of link restoration, typically this only happens in highly loaded networks.
Link restoration is preceded by a fault localization interval. The fault localization time for a larger network can be considerable and undesirable. However, at the expense of the fault localization delay, network routing tables are validated and the restoration is completed faster.
In contrast, path restoration implementation does not require any fault localization and is immediately started after a fault alarm is detected. However, path restoration can be time-consuming for distant Source-Destination (S-D) pairs or in heavily loaded networks. This is mostly due to performing rerouting and switching procedures and often repeating these procedures for alternate paths before successfully establishing a restoration path.
In summary, link restoration removes only the failed link capacity by pinpointing the failure location, while path restoration removes the affected path capacity from the network resources by excluding all of the path links.
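The contrast drawn above can be illustrated with a minimal sketch. It assumes a small hypothetical topology (TOPOLOGY), fewest-hop routing by breadth-first search, and no wavelength or capacity constraints. None of these choices come from the cited prior art, and the function names are illustrative only. Link restoration removes only the failed link and reroutes between its two end nodes, while path restoration removes every link of the affected path and reroutes between source and destination.

```python
from collections import deque

# Hypothetical seven-node mesh used only for illustration.
TOPOLOGY = {
    "A": ["B", "F"],
    "B": ["A", "C", "E"],
    "C": ["B", "D", "E"],
    "D": ["C", "G"],
    "E": ["B", "C"],
    "F": ["A", "G"],
    "G": ["F", "D"],
}


def shortest_path(adj, src, dst, banned=frozenset()):
    """Fewest-hop path from src to dst that avoids every link in banned."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in parent and frozenset((node, nxt)) not in banned:
                parent[nxt] = node
                queue.append(nxt)
    return None  # no route survives


def link_restoration(adj, path, failed):
    """Patch around the localized failed link; keep the other working links."""
    failed_link = frozenset(failed)
    for i in range(len(path) - 1):
        if frozenset((path[i], path[i + 1])) == failed_link:
            detour = shortest_path(adj, path[i], path[i + 1], {failed_link})
            if detour is None:
                return None
            # The detour may revisit nodes of the original path, which is
            # the "restoration loop" effect mentioned in the text.
            return path[:i] + detour + path[i + 2:]
    return path  # the failed link is not on this path


def path_restoration(adj, path):
    """Reroute end to end, excluding every link of the affected path."""
    banned = {frozenset(pair) for pair in zip(path, path[1:])}
    return shortest_path(adj, path[0], path[-1], banned)


working = ["A", "B", "C", "D"]
print(link_restoration(TOPOLOGY, working, ("B", "C")))  # ['A', 'B', 'E', 'C', 'D']
print(path_restoration(TOPOLOGY, working))              # ['A', 'F', 'G', 'D']
```

In this example link restoration reconfigures only the switches around the cut (B, E and C), whereas path restoration abandons the still-working links A-B and C-D and has no need to know which link failed.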
All-optical networks are designed based on different models and control mechanisms. Optical components vary in their power monitoring and spectrum analysis capabilities. Some all-optical components, such as Optical Amplifiers (OA), have limited or no electronic monitoring and analysis abilities. As a result, they may be able to detect loss of signal but cannot identify a high Bit Error Rate (BER) and/or manage any fault localization procedures. In contrast, there are components capable of detecting failures and able to take proper action in response to service disruptions, for instance Optical Cross-Connects (OXCs).
All-optical networks can be created based on different design topologies such as overlay, augmented, peer-to-peer or integrated. The all-optical WDM network architecture considered herein is the overlay model. In this structural design, optical switches interconnect data links and create the data network, while the control units including electrical/optical/electrical (E/O/E) conversions and optical amplifiers interconnect control links and construct the control data network, also referred to as supervisory channels.
The data network, consisting of the optical switches and data channels, operates in a circuit switching fashion, while the control data network operates in a packet switching fashion. The traffic in the control data network consists of small control packets, resulting in much lighter traffic. Therefore, the control channel is usually implemented by one or more dedicated wavelengths in the same fiber link. When a connection request arrives, a control packet in the control data network routes and configures switches to create a transparent optical data path, namely a lightpath. Different criteria are considered and various techniques are employed to set up the most resource-efficient lightpaths (cf. H. Zang, J. P. Jue, and B. Mukherjee, “A Review of Routing and Wavelength Assignment Approaches for Wavelength-Routed Optical WDM Networks,” SPIE Optical Networks Magazine, vol. 1, no. 3, pp. 47-60, January 2000).
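As one illustration of lightpath set-up, the following minimal sketch applies first-fit wavelength assignment, a common heuristic, to a route that has already been chosen. It assumes that no wavelength converters are present, so the same wavelength must be free on every link of the route. The function names and the in_use bookkeeping structure are hypothetical and are not taken from the cited review.

```python
def first_fit_wavelength(path, in_use, num_wavelengths):
    """Return the lowest-indexed wavelength free on every link, or None."""
    links = [frozenset(pair) for pair in zip(path, path[1:])]
    for w in range(num_wavelengths):
        if all(w not in in_use.get(link, set()) for link in links):
            return w
    return None


def set_up_lightpath(path, in_use, num_wavelengths):
    """Reserve a wavelength along the path; None means the request is blocked."""
    w = first_fit_wavelength(path, in_use, num_wavelengths)
    if w is None:
        return None
    for pair in zip(path, path[1:]):
        in_use.setdefault(frozenset(pair), set()).add(w)
    return w


in_use = {}  # link -> set of wavelength indices already reserved
print(set_up_lightpath(["A", "B", "C"], in_use, 2))       # 0
print(set_up_lightpath(["B", "C", "D"], in_use, 2))       # 1
print(set_up_lightpath(["A", "B", "C", "D"], in_use, 2))  # None (blocked)
```

The third request is blocked because link B-C has no free wavelength left, even though wavelength 1 is still free on link A-B.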
Unlike Synchronous Optical Network (SONET) networks, which operate point-to-point using the peer model, all-optical transparent networks function end-to-end. As a result, these networks could drop and analyze data only at the end points of established lightpaths (sinks).
Control mechanisms are primarily developed based on either centralized or distributed models. Although centralized control mechanisms are relatively simple and work well for static traffic in small networks, they are not considered feasible for dynamically expanding systems. In contrast, distributed control mechanisms are complex but more scalable and reliable than the centralized ones. Thus, distributed control models are employed to manage dynamic traffic in large networks. In addition to those models, there are hierarchical models, which are a combination of centralized and distributed models.
Hierarchical models are mostly applied to increasingly large networks and dynamic information systems because of their ability to coordinate the network controlling messages. However, hierarchical management models cannot be economically and practically implemented for all network topologies. Consequently, researchers are still pursuing distributed rather than hierarchical management structures for mesh networks.
The optical layer protection schemes are similar to SONET/SDH (Synchronous Digital Hierarchy) techniques. However, their implementation is substantially different. The optical layer consists of the Optical Channel (OCh) layer, also known as the path layer, the Optical Multiplex Section (OMS) layer (line layer), and the Optical Transmission Section (OTS) layer (cf. R. Ramaswami and K. N. Sivarajan, “Optical Networks: A Practical Perspective,” Morgan Kaufmann, 1998), as shown in FIG. 1 of the prior art.
Nevertheless, fault localization can technically be achieved in different layers, for instance in the physical layer through electronic processing and the use of photodiodes and/or spectrum analyzers.
In SONET, the downstream node attached to the disconnected link detects a failure and reports it to the network management entity. The fault condition is then communicated to the neighbouring nodes to inhibit them from issuing false alarms to management. However, fault localization in SONET involves examining overhead at each node, which slows down the fault localization procedure.
In optical transport networks (long-haul), the fundamental philosophy of the SONET frame has been adopted with a more advanced protocol suitable for high WDM rates, known as the digital wrapper (cf. J. Ballintine, “A Proposal Implementation for a Digital Wrapper for OCh Overhead,” ANSI T1X1.5/99-003, January 1999, http://www.t1.org/index/0816). This prior art protocol is able to detect 16 errors and correct 8 errors. Although the digital wrapper greatly improves the BER, it also consumes approximately 7% of the bandwidth and suffers from the related delay.
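The figures quoted above are consistent with a Reed-Solomon RS(255,239) forward error correction code, which is the code commonly associated with the digital wrapper. The passage above does not name the code, so the parameters n = 255 and k = 239 are an assumption. Under that assumption, the following sketch reproduces the arithmetic: a code with n - k parity symbols can detect up to n - k symbol errors per codeword when used for detection only, can correct up to (n - k)/2, and adds (n - k)/k overhead relative to the payload.

```python
def rs_properties(n, k):
    """Error-handling capability and overhead of an RS(n, k) code, per codeword."""
    parity = n - k
    return {
        "detectable": parity,        # symbol errors, detection-only use
        "correctable": parity // 2,  # symbol errors
        "overhead": parity / k,      # parity symbols relative to payload
    }


props = rs_properties(255, 239)  # assumed digital wrapper parameters
print(props["detectable"], props["correctable"])  # 16 8
print(round(props["overhead"] * 100, 1))          # 6.7 (percent)
```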
For all-optical transparent networks, although there are a number of prior art papers on fault protection and restoration, there are only a few on fault localization. The fault localization protocols introduced to date consider different aspects of fault localization, such as signalling, alarming, monitoring, detecting, and filtering, and address various topologies. For instance, in A. V. Sichani and H. T. Mouftah, “A Novel Broadcasting Fault Detection Protocol in WDM Networks,” Proceedings of QBSC 2004, pp. 222-224, May 2004, a fault localization method named the broadcasting fault detection protocol is proposed that localizes failures by propagating controlling and localizing signals through the supervisory channels. However, the controlling bandwidth usage is considerable. The rolling back protocol is proposed in A. V. Sichani and H. T. Mouftah, “Rolling Back Signaling Protocol—A Novel Fault Localization Protocol for WDM Mesh Networks,” CIC China Communications Magazine, vol. 1, no. 1, pp. 101-105, December 2004, for fast fault localization, and reduces the number of controlling signals in the supervisory channels. While this protocol significantly decreases the traffic in the controlling network, its implementation demands adding more monitoring components to the network.
The work in A. G. Hailemariam, G. Ellinas, and T. Stern, “Localized Failure Restoration in Mesh Optical Networks,” Proceedings of IEEE OFC'04, pp. 23-27, February 2004, partitions the network into sub-networks called “islands” and discovers a fault condition (a node or link failure) using an island-by-island restoration protocol. Island identification is an off-line procedure, executed during network planning and occasionally updated when the network topology is changed. It is claimed in P.-H. Ho, J. Tapolcai, T. Cinkler, “Segment Shared Protection in Mesh Communications Networks with Bandwidth Guaranteed Tunnels,” IEEE/ACM Transactions on Networking, vol. 12, no. 6, pp. 1105-1118, December 2004, that the protocol outperforms the segmented restoration protocol in terms of time delay, overhead and complexity.
In order to reduce the number of generated failure alarms, another approach introduced in S. Stanic, S. Subramaniam, H. Choi, G. Sahin, and H. Choi, “On Monitoring Transparent Optical Networks,” Proceedings of IEEE ICPPW'02, pp. 218-223, August 2002 optimizes the number of monitoring components using an alarm matrix.
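The general idea of an alarm matrix can be sketched as follows. This is not the algorithm of the cited paper. It is a minimal illustration that assumes a single link failure, monitors placed at lightpath sinks, and hypothetical link labels such as "A-B". Each row of the matrix records which monitors would raise an alarm if a given link were cut, and a failure can be localized unambiguously only when its row differs from every other row.

```python
def build_alarm_matrix(links, monitored_paths):
    """Map each link to a row: 1 where a cut on that link trips the monitor."""
    return {
        link: tuple(int(link in path_links) for path_links in monitored_paths)
        for link in links
    }


def localize(matrix, alarms):
    """Return every link whose row matches the observed alarm pattern."""
    observed = tuple(alarms)
    return [link for link, row in matrix.items() if row == observed]


links = ["A-B", "B-C", "C-D"]
# Monitor 0 watches a lightpath over A-B and B-C; monitor 1 over B-C and C-D.
monitors = [{"A-B", "B-C"}, {"B-C", "C-D"}]

matrix = build_alarm_matrix(links, monitors)
print(localize(matrix, [1, 1]))  # ['B-C']

# With monitor 0 alone, A-B and B-C produce the same alarm pattern.
print(localize(build_alarm_matrix(links, monitors[:1]), [1]))  # ['A-B', 'B-C']
```

The second call shows the trade-off that monitor placement has to balance: fewer monitors generate fewer alarms, but too few leave several links indistinguishable.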
Another research work, H. Zeng, C. Huang and A. Vukovic, “Monitoring Cycles for Fault Detection in Meshed All-optical Networks,” Proceedings of International Conference on Parallel Processing/International Workshop in Optical Networks and Management (ICPP/ONCM'04), pp. 434-439, August 2004, proposes a framework for fault detection using a set of monitoring cycles.
A fault location algorithm is proposed in C. Mas and P. Thiran, “An Efficient Algorithm for Locating Soft and Hard Failures in WDM Networks,” IEEE Journal on Selected Areas in Communications, vol. 18, no. 10, pp. 1900-1911, October 2000, that operates in the physical layer. This protocol is capable of localizing multiple failures and filtering false alarms. However, the time and space complexity of the protocol could be considerable for large networks.
A finite state machine method is proposed in C.-S. Li and R. Ramaswami, “Automatic Fault Detection, Isolation, and Recovery in Transparent All-optical Networks,” IEEE Journal of Lightwave Technology, vol. 15, no. 10, pp. 1784-1793, October 1997, but its computational complexity for large-scale and dynamic networks impedes its deployment. There is also a proposal on employing GMPLS for fault detection in all-optical networks (cf. “Generalized Multi-Protocol Label Switching (GMPLS) Architecture,” The Internet Engineering Task Force (IETF), RFC 3945, October 2004, http://www.faqs.org/rfcs/rfc3945.html).
Therefore, in view of the aforementioned prior art, the present invention seeks to provide a fast and efficient method and system for fault localization in a data network of nodes through use of a link restoration protocol.