All-optical networks using wavelength division multiplexing (WDM) are increasingly being deployed for a wide variety of communication applications. WDM techniques allow optical signals having different wavelengths to be multiplexed into a single optical fiber. Current WDM deployments allow multiplexing of up to about 16 different wavelengths on a single fiber, but systems multiplexing 32 or more different wavelengths on a single fiber are expected to become available soon. Each of the wavelengths serves as an optical carrier and can be used independently of the other wavelengths, such that different wavelengths may use different modulation formats to carry different signal types. In a simple example, each wavelength may carry a modulation signal representing a synchronous optical network/synchronous digital hierarchy (SONET/SDH) client payload, where each client is a SONET-rate TDM application and the common carried signals are in an OC-48 or an OC-192 format.
FIG. 1 shows a conventional optical routing device 10 which includes a wavelength selecting cross-connect (WSCC) 12, two input optical fibers 14-1, 14-2 and two output optical fibers 14-3, 14-4. The routing device 10 in this embodiment is configured to route incoming optical signals at wavelengths λ₁ and λ₂ on fiber 14-1 to output fibers 14-4 and 14-3, respectively, and to route incoming optical signals at wavelengths λ₁′ and λ₂′ on fiber 14-2 to output fibers 14-4 and 14-3, respectively. The WSCC 12 thus serves to cross-connect incoming wavelengths on a given input fiber to different output fibers, but does not provide any transformation in wavelength. When only this type of routing device is present in an optical network, the network typically routes a given end-to-end demand using a single wavelength. If a primary network path assigned to the given demand fails, the demand generally must be carried on a secondary or restoration path using exactly the same wavelength as the primary path.
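The routing behavior of the WSCC 12 can be illustrated with a minimal sketch. The table layout, the function name `wscc_route` and the string labels for the wavelengths are assumptions made for illustration only; they are not part of the device of FIG. 1.

```python
# Hypothetical model of the FIG. 1 cross-connect: a static table from
# (input fiber, wavelength) to output fiber. Wavelength labels are plain
# strings; "l1p" and "l2p" stand for the primed wavelengths on fiber 14-2.
WSCC_TABLE = {
    ("14-1", "l1"): "14-4",
    ("14-1", "l2"): "14-3",
    ("14-2", "l1p"): "14-4",
    ("14-2", "l2p"): "14-3",
}


def wscc_route(in_fiber, wavelength, table=WSCC_TABLE):
    """Return (output fiber, wavelength) for an incoming signal.

    The wavelength is passed through unchanged, because a WSCC selects an
    output fiber but performs no wavelength transformation.
    """
    out_fiber = table[(in_fiber, wavelength)]
    return out_fiber, wavelength
```

Because the output wavelength always equals the input wavelength, a demand crossing a chain of such devices keeps a single wavelength from end to end.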
FIG. 2 illustrates an optical network 20 in which wavelength transformations may be provided for signals traversing the network, but only at the interface between a client and the optical network. A first client equipment (CE) device 18-1 communicates with a second CE device 18-2. The first CE device 18-1 uses wavelength λ₁ and the second CE device uses wavelength λ₃. The first CE 18-1 transmits a signal at λ₁ to a wavelength adapter 22 which maps the incoming wavelength λ₁ to an outgoing wavelength λ₂. A wavelength adapter (WA) is a device which allows conversion of wavelength at the client-network interface. The wavelength λ₂ is used to carry the modulation signal of CE 18-1 from an access node 24 of network 20 to an egress node 26 of network 20. The egress node 26 delivers the λ₂ signal to a second WA 28 which maps the wavelength λ₂ to wavelength λ₃ for transmission to the second CE 18-2. In the event of a failure in the primary path through optical network 20 from CE 18-1 to CE 18-2, a secondary or restoration path with a different wavelength, such as λ₄, may be used to transport the customer demand through the network 20. Other types of optical network elements combine features of the WSCC 12 of FIG. 1 and the WAs 22, 28 of FIG. 2. For example, a wavelength interchange device may be used to cross-connect incoming wavelengths onto different output fibers while also providing transformation of wavelengths. Such devices are called wavelength interchanging cross-connects (WICCs).
An important issue in the design of large-scale optical networks including WSCCs, WAs, WICCs and other optical signal routing devices relates to traffic restoration in the event of a failure in a link, span or node. A simplistic approach to restoration in an optical network is to provide complete redundancy, such that the network includes a dedicated back-up or secondary connection for each primary connection of the network. When a link, span or node of the primary connection fails, traffic may then be switched onto the corresponding elements of the secondary connection. Unfortunately, this approach uses a large amount of restoration capacity and therefore may be undesirable in many networks. More sophisticated approaches involve the use of a path restoration algorithm to provide automatic restoration of network traffic in the event of a primary path failure, while sharing restoration capacities whenever possible.
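The capacity cost of complete redundancy, compared with shared restoration capacity, can be illustrated with a small calculation. The sketch below assumes unit-size demands, single-link failures only, and a hypothetical set of link names and paths; none of these values come from the figures.

```python
from collections import defaultdict

# Hypothetical demands, one unit of traffic each. Paths are lists of link names.
demands = {
    "d1": {"primary": ["A-B"], "restoration": ["A-C", "C-B"]},
    "d2": {"primary": ["D-E"], "restoration": ["D-C", "C-B", "B-E"]},
}


def dedicated_spare(demands):
    """Spare units per link when every demand owns its back-up capacity."""
    spare = defaultdict(int)
    for d in demands.values():
        for link in d["restoration"]:
            spare[link] += 1
    return dict(spare)


def shared_spare(demands):
    """Spare units per link when capacity is shared across single-link failures.

    For each possible failed link, count the demands rerouted onto each
    restoration link, then keep the worst case over all failures.
    """
    failures = {link for d in demands.values() for link in d["primary"]}
    spare = defaultdict(int)
    for failed in failures:
        load = defaultdict(int)
        for d in demands.values():
            if failed in d["primary"]:
                for link in d["restoration"]:
                    load[link] += 1
        for link, units in load.items():
            spare[link] = max(spare[link], units)
    return dict(spare)
```

In this example the two primary paths cannot fail together under the single-failure assumption, so link C-B needs two spare units with dedicated back-up but only one when the capacity is shared.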
It should be noted that large-scale optical networks typically include a large number of spans, and two different point-to-point links may share a common span section. FIG. 3 illustrates a shared span section in a portion of a network including nodes A, B and C. The dotted lines AC and AB represent two distinct optical links. The physical layout, shown by solid lines, is such that both of these links share the span AS. If this span fails due to a fiber cut or other problem, then both the links AC and AB will fail. Thus a demand using link AB on its primary path cannot be restored on a route using link AC. It is therefore important that a given restoration algorithm achieve restoration of network traffic in the event of span failures as well as link failures, by providing distinct spans and links for the restoration path. Furthermore, to decrease vulnerability of the network to node failures, it is also desirable to perform automatic restoration in the event of single node failures. Thus the overall goal of an effective restoration algorithm should be to perform automatic restoration in the event of single link, span or node failures. The term "automatic" connotes restoration by control computers in the network, rather than by manual intervention, thus permitting fast restoration.
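The shared-span condition of FIG. 3 can be checked mechanically once each link is mapped to the spans that carry it. In the sketch below, span "AS" is taken from the description above, while the span names "SB" and "SC" and the function names are assumptions made for illustration.

```python
# Assumed physical layout for FIG. 3: each logical link is carried over a
# set of spans. "AS" comes from the text; "SB" and "SC" are hypothetical
# names for the remaining sections of links AB and AC.
LINK_SPANS = {
    "AB": {"AS", "SB"},
    "AC": {"AS", "SC"},
}


def links_hit_by(span, link_spans=LINK_SPANS):
    """Return the links that fail when the given span fails."""
    return sorted(link for link, spans in link_spans.items() if span in spans)


def span_disjoint(link_a, link_b, link_spans=LINK_SPANS):
    """True if the two links share no span, so one can back up the other."""
    return link_spans[link_a].isdisjoint(link_spans[link_b])
```

With this layout, `links_hit_by("AS")` reports both AB and AC, and `span_disjoint("AB", "AC")` is false, which is why link AC cannot serve as restoration for a demand whose primary path uses link AB.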
FIG. 4 shows a portion of a network including nodes A, B, C and D providing a bidirectional path at a wavelength λ₁ between a first CE 18-1 and a second CE 18-2. In simple optical networks, failures are generally discovered through signal strength measurements, which may be collected for each individual wavelength at each node of the network. If a link failure occurs between nodes B and C as shown in FIG. 4, the bidirectional nature of the path allows each of the nodes A, B, C and D to detect a loss of signal (LOS) condition, but, with only the LOS information, none of these nodes will know the exact location of the failure. As a result, local restoration around the failed link, by the nodes connecting the failed link, is generally not possible, assuming that the optical network under consideration does not employ any other mechanism to isolate failures. This inability to determine the exact location of the failure from LOS information also requires that the restoration path be disjoint from the primary path. Depending on whether a network includes WSCCs, WAs or WICCs, additional restrictions may be imposed on the restoration and primary paths of a demand. The network path of FIG. 4 includes only WSCCs, and the secondary or restoration path therefore must have the same wavelength as the primary path. As previously noted, more complex networks such as the network of FIG. 2 also include WAs, such that the restoration path could have a different wavelength than the primary path, although the same wavelength generally must be used from an access node such as node 24 of FIG. 2 to an egress node such as node 26. FIG. 5 shows a more general situation in which nodes A, B, C and D between source CE 18-1 and destination CE 18-2 each include WICCs. The FIG. 5 network thus permits local wavelength transformations at each node, such that the path from the access node A to the egress node D need not be at a single wavelength.
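The constraints described above can be summarized as a simple validity check on a candidate restoration path. The sketch below is illustrative only: the function names, the string labels for device types and wavelengths, and the interpretation of "disjoint" as sharing no link and no intermediate node are all assumptions.

```python
def links_of(path):
    """Undirected links of a node path such as ["A", "B", "C", "D"]."""
    return {frozenset(hop) for hop in zip(path, path[1:])}


def disjoint(primary, restoration):
    """Assumed meaning of disjoint: no common link or intermediate node."""
    shared_links = links_of(primary) & links_of(restoration)
    shared_nodes = set(primary[1:-1]) & set(restoration[1:-1])
    return not shared_links and not shared_nodes


def wavelength_ok(device, primary_wl, restoration_wls):
    """Check per-hop restoration wavelengths against the device type."""
    if device == "WSCC":  # no conversion: must reuse the primary wavelength
        return all(w == primary_wl for w in restoration_wls)
    if device == "WA":    # conversion only at the client interface
        return len(set(restoration_wls)) == 1
    if device == "WICC":  # conversion allowed at every node
        return True
    raise ValueError(f"unknown device type: {device}")


def restoration_ok(primary, primary_wl, restoration, restoration_wls, device):
    """True if the restoration path is disjoint and wavelength-feasible."""
    return disjoint(primary, restoration) and wavelength_ok(
        device, primary_wl, restoration_wls
    )
```

For example, with a primary path A, B, C, D at one wavelength, a restoration path through two other hypothetical nodes E and F is acceptable in a WSCC-only network only if every hop reuses the primary wavelength, whereas a WICC network accepts any per-hop wavelengths.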
Path restoration techniques may be classified in many dimensions. A first classification is based on where the paths are computed. In that sense, the restoration computation may be either centralized or distributed. In the former, the restoration path computation is done at a central controller which has global information regarding the network. In distributed restoration, each node computes the restoration paths for demands passing through that node. Another classification of path restoration techniques depends on where the restoration action is implemented. In particular, the restoration may be local, end-to-end path based or hybrid. In local restoration, the nodes closest to the point of failure initiate restoration action for all demands affected by a given failure. In end-to-end path based restoration, the source-destination node pairs of demands affected by a given failure initiate the restoration action. Hybrid restoration approaches have aspects of both local and end-to-end path-based restoration in that they seek to find the best restoration path, in terms of minimizing the required spare capacity, that is closest to the point of failure.
Restoration techniques may also be differentiated by the time at which the restoration paths are computed. Discovery-based approaches determine restoration paths after a failure event has occurred, while precomputed approaches determine restoration paths before the failure event and the failure event merely triggers the activation of the precomputed paths. Discovery-based approaches may be centralized or distributed, but their defining characteristic is that they compute restoration paths in real time, after the failure occurs. Centralized discovery-based approaches use some mechanism (e.g., alarms) such that the network elements detecting failures can communicate to the central controller, which then computes the best available paths. In the distributed discovery-based approach, as soon as a failure event occurs, the nodes affected by the failure need to find out where spare capacity is available, and to create restoration paths by reserving available spare capacity on selected paths. If two requests contend for the same spare capacity, then some form of contention-resolution procedure is needed to resolve the contention. The capacity search procedure, including contention resolution, is performed after the failure but before the demands affected by a failure can be rerouted. As a result, for distributed discovery-based restoration, the restoration times tend to be large and/or the spare capacity utilization is poor. Moreover, many of the constraints imposed by optical networks do not allow implementation of simple distributed discovery-based approaches to restoration.
A prior art centralized precomputation technique is described in J. Anderson, B. T. Doshi, S. Dravida and P. Harshavardhana, "Fast Restoration of ATM Networks," JSAC 1991, which is incorporated by reference herein. In centralized precomputation, a central controller in the network stores information on the entire network topology as well as capacities of all links in the network. This controller runs an optimization algorithm with the objective of computing alternate paths for every possible failure in the network while utilizing minimum redundant capacity, and routing tables specifying these alternate paths are downloaded to the appropriate network elements. When a failure is detected by a network element, it activates the corresponding alternate routing table. Similar action is taken by all the network elements as and when the elements receive the failure information. A drawback of this approach is that it requires a central controller of substantial computing capacity and may therefore be hard to implement as the network increases in size.
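The table-activation step of centralized precomputation may be sketched as follows. This is not the method of the cited reference; the class name, the failure identifiers and the table contents are hypothetical, and the sketch only shows a network element switching to a table that a central controller is assumed to have downloaded in advance.

```python
class NetworkElement:
    """Node holding routing tables downloaded from a central controller."""

    def __init__(self, normal_table, alternate_tables):
        # alternate_tables maps a failure identifier to a routing table;
        # each routing table maps a destination to a next hop.
        self.alternate_tables = alternate_tables
        self.active_table = normal_table

    def on_failure(self, failure_id):
        """Activate the precomputed table for this failure, if one exists."""
        table = self.alternate_tables.get(failure_id)
        if table is not None:
            self.active_table = table
        return table is not None

    def next_hop(self, destination):
        return self.active_table[destination]


# Hypothetical node A: traffic to D normally leaves via B, and via E
# if the link between B and C has failed.
node_a = NetworkElement(
    normal_table={"D": "B"},
    alternate_tables={"link B-C": {"D": "E"}},
)
```

No path computation occurs at failure time in this arrangement; the cost is instead borne by the central controller, which must compute and distribute a table for every failure it intends to cover.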
A prior art distributed discovery-based computation technique is described in W. D. Grover, "The Self-Healing Network: A Fast Distributed Restoration Technique for Networks Using Digital Cross Connect Machines," IEEE Globecom 1987, and U.S. Pat. No. 4,956,835, issued to W. D. Grover on Sep. 11, 1990, both of which are incorporated by reference herein. In this approach, when a link failure is discovered, the nodes at the two ends of the failed link initiate a search for spare capacity in the network on links that are potential candidates for alternate routing. The available spare capacity is then allocated on a first-come-first-served basis by one of the nodes to which the failed link is attached. This approach suffers from a number of significant drawbacks, especially under the constraints likely to be present in typical conventional optical networks. First, the discovery of spare capacity after failure introduces at least a round-trip delay between nodes, thereby increasing restoration time. Second, even if sufficient spare capacity is available to restore all traffic affected by the failure, the fact that capacity is allocated on a first-come-first-served basis may not allow full restoration in practice. Third, these and most other distributed discovery-based techniques are fundamentally intended for restoration of single link failures in networks in which the location of the failure can be identified. This is because the node at one end of the failed link initiates capacity discovery. Since typical conventional optical networks do not have failure isolation capability, the distributed discovery-based approach will not work for such networks. Finally, the distributed discovery-based approach exemplified in the above-cited Grover references does not work well for node failures. In the case of a node failure, the burden of discovering spare capacity falls on multiple nodes, not just the nodes on the two ends of the failed link. The Grover approach generally cannot be used by multiple nodes simultaneously.
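The second drawback noted above, that first-come-first-served allocation may fail to restore all traffic even when sufficient spare capacity exists, can be seen in a toy example. The sketch below is not the Grover algorithm; the demand names, route names and capacities are hypothetical, and the exhaustive search is used only to show that a complete assignment exists.

```python
from itertools import product


def fcfs(requests, capacity):
    """Serve requests in arrival order; each takes its first feasible route."""
    cap = dict(capacity)
    restored = {}
    for name, candidates in requests:
        for route in candidates:
            if all(cap[link] > 0 for link in route):
                for link in route:
                    cap[link] -= 1
                restored[name] = route
                break
    return restored


def best_joint(requests, capacity):
    """Exhaustively search for the assignment restoring the most demands."""
    best = {}
    options = [[None] + list(candidates) for _, candidates in requests]
    for choice in product(*options):
        cap = dict(capacity)
        feasible = True
        for route in choice:
            for link in route or ():
                cap[link] -= 1
                if cap[link] < 0:
                    feasible = False
        assignment = {
            name: route
            for (name, _), route in zip(requests, choice)
            if route is not None
        }
        if feasible and len(assignment) > len(best):
            best = assignment
    return best


# Hypothetical case: one spare unit on each of two single-link routes.
capacity = {"r1": 1, "r2": 1}
requests = [
    ("X", [("r1",), ("r2",)]),  # X arrives first and can use either route
    ("Y", [("r1",)]),           # Y can only use r1
]
```

Here the first-come-first-served allocation gives route r1 to demand X and leaves demand Y unrestored, although assigning X to r2 and Y to r1 would restore both.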
Variants of the Grover approach are described in C. H. Yang et al., "FITNESS: Failure Immunization Technology for Network Service Survivability," IEEE Globecom 1988; C. Edward Chow, J. Bicknell, S. McCaughey and S. Syed, "A Fast Distributed Network Restoration Algorithm," IEEE Globecom '93, pp. 261-267, 1993; and S. Hasegawa, Y. Okanone, T. Egawa and H. Sakauchi, "Control Algorithms of SONET Integrated Self-Healing Networks." Unfortunately, none of these variants overcomes the fundamental deficiencies of the discovery-based Grover approach. Generally, simultaneous attempts by multiple nodes to discover and reserve restoration capacity require multiple message exchanges, contention resolution and path calculation, and the variants are thus unable to avoid excessive restoration delays.
There is presently no end-to-end path restoration approach which provides a distributed precomputation technique suitable for use in an optical network. Although a distributed precomputation technique is described in U.S. Pat. Nos. 5,435,003 and 5,537,532, both entitled "Restoration in Communications Networks" and issued to R. S. K. Chng, C. P. Botham and M. C. Sinclair, this distributed precomputation technique is not well-suited for use in an optical network which includes WSCCs, WAs, WICCs or other typical optical routers such as those described in conjunction with FIGS. 1, 2, 4 and 5 above. The technique precomputes alternate paths for certain failure scenarios. After a failure occurs, a pair of end nodes affected by the failure attempt to find alternate paths in real time. If a precomputed restoration path exists, the end nodes switch traffic to the precomputed path while the real-time paths are being computed. Once the best real-time path is computed, if the end nodes determine that the real-time path is better than the precomputed path, they switch traffic to the real-time path. This technique suffers from a number of drawbacks which render it of limited utility in an optical network. For example, the technique fails to address the possibility that a path computed in real time for one failure scenario may overlap with precomputed paths for another scenario. Also, a path computed for one demand in real time may overlap with a precomputed path for another demand. There is no procedure for resolving conflicts between demands contending for restoration capacity on the same link, either during precomputation or during real-time computation. In addition, the technique does not support the use of failure-disjoint alternate paths for situations in which fault isolation is not possible. Moreover, the technique provides no resource optimization other than that of picking the best path among a set of ad hoc paths. These and other deficiencies of the approach of U.S. Pat. Nos. 5,435,003 and 5,537,532 render it of limited value in a complex, large-scale optical network. The prior art thus fails to provide a distributed precomputation restoration approach which provides acceptable performance in a large-scale optical network.
It is therefore apparent that a need exists for improved network restoration techniques which utilize distributed precomputation to provide path restoration in large-scale optical networks after link, span or node failures, while avoiding the problems associated with the above-described conventional restoration techniques.