Developments in DWDM-based switching technology are giving rise to networking elements that can manipulate individual lightwave carriers or wavebands in ways that are logically similar to SONET-era add-drop multiplexers and cross-connects in terms of the agility they provide for reconfiguration of the transport layer. Like SONET elements that add/drop or cross-connect individual STS-1 or STS-n tributaries, optical ADMs (OADMs) and optical cross-connects (OCX) can add/drop or cross-connect wavelengths (or wavebands) [1]. All references in square brackets are listed at the end of the disclosure. One advantage of these DWDM networking elements is that they provide the reconfigurability to adapt the logical wavelength connectivity layer to changing demand patterns in the service layers, enabling the concept of an “automatically switched” (also known as “self-organizing”) transport network (ASTN) [2], [3]. Another advantage is that OADM and OCX elements enable mesh restoration schemes for the optical networking layer.
One driver for optical layer mesh restoration over the ring protection schemes of SONET is the greater capacity efficiency that can be achieved [11]-[24]. Mesh networking allows routing of the working demands over shortest paths of the facilities graph and greater efficiency in the sharing of spare capacity for restoration. In practice, however, some real networks are so sparse in their facility-route topology that it may still be hard for mesh-based restoration to prove in against a ring-based solution, which is less capacity-efficient but is based on less costly OADMs rather than OCX. The emphasis on “low-connectivity” graphs reflects the reality of several North American inter-exchange carrier (IXC) networks. Here d denotes the average nodal degree, that is, the average number of separate facility routes leaving each node. While European networks often have d>4 (see for example the networks in [10], [12]), North American IXC networks can be extremely sparse, with d as low as 2.2 (see for example [6]).
A bi-connected network with d only slightly above two will have a preponderance of degree-2 locations and will tend to contain chain sub-networks, like beads on a string. FIG. 1 is a conceptual example of such a sparse facility topology. The example is illustrative only, but to varying extents is characteristic of the North American portions of the networks described in [4] through [10]. At least empirically it is well recognized that North American networks, especially in Canada and over large parts of the mid-U.S.A., tend to be of lower degree than European networks. This is perhaps because, per unit of geographical area, there have been fewer revenue-producing source/sink centers in these regions to justify the historical development of a richer fabric of direct facility routes at the continental scale. More recently, advances in transmission capacity, and the related economy-of-scale effects in capacity cost, only serve to reinforce the tendency towards sparse facility graphs [25]. With large amounts of capacity and economy of scale it can often be economic to route longer distances, over sparser graphs, rather than seek additional facility routes, at least as a short-term recourse to meeting demand. There is thus a practical reason to be interested in transport network research that is especially focused on sparse transport graphs. The extent to which ring-based networks have been deployed at the IXC level in North America compared to Europe is in a sense also a recognition of this sparseness, in that rings are easily mapped onto these natural chains. However, rings have to be closed to operate, whereas a set of chain sub-networks may, as proposed herein, be operated at a higher level (a meta-level) as a form of mesh-restorable network.
On the other hand, a very sparse graph can make the economic advantage of mesh-based networking questionable. For a few years now informal appraisals have often judged that a network as sparse as in FIG. 1 would be simply too low-degree to benefit enough from mesh restoration (relative to a ring-based status quo). After all, mesh efficiencies can only possibly occur at nodes with d=3 or higher: a d=1 node is not restorable and a d=2 node is already as well served by a shared-protection (BLSR type) ring as it can be. And increasing d by simply acquiring more rights-of-way is generally a long-term and expensive proposition. Right-of-way costs can be one of the single largest investments the network operator faces, involving years of legal work to piece together individual purchases, leases, municipal approvals, permits, and so on, to establish one new edge in the facilities graph. An object of this invention, therefore, is to enhance the efficiency of span-restorable mesh networks on low-degree topologies.
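The average nodal degree d and the chain sub-networks discussed above can be computed directly from a list of spans. The following is a minimal sketch only: the function names `average_degree` and `find_chains` and the six-span example topology are assumptions made for illustration and are not taken from FIG. 1.

```python
from collections import defaultdict

def average_degree(spans):
    """Average nodal degree: d = 2 * |spans| / |nodes|."""
    nodes = {n for span in spans for n in span}
    return 2 * len(spans) / len(nodes)

def find_chains(spans):
    """Return (anchor, [degree-2 nodes], anchor) for each maximal chain.

    Assumes a simple graph (no parallel spans). A network that is one
    closed ring with no anchor node is not treated specially.
    """
    adj = defaultdict(set)
    for a, b in spans:
        adj[a].add(b)
        adj[b].add(a)
    seen, chains = set(), []
    for start in sorted(adj):
        if len(adj[start]) != 2 or start in seen:
            continue
        seen.add(start)
        sides = []
        # Walk away from the start node in each of its two directions
        # until a node that is not degree-2 (an anchor) is reached.
        for first in sorted(adj[start]):
            prev, cur, side = start, first, []
            while len(adj[cur]) == 2 and cur not in seen:
                seen.add(cur)
                side.append(cur)
                nxt = next(n for n in adj[cur] if n != prev)
                prev, cur = cur, nxt
            sides.append((side, cur))
        (left, left_anchor), (right, right_anchor) = sides
        chains.append((left_anchor, left[::-1] + [start] + right, right_anchor))
    return chains

# Hypothetical sparse topology: two degree-3 anchors (A, D) joined by a
# direct span and by two chains (B-C and E).
spans = [("A", "B"), ("B", "C"), ("C", "D"),
         ("A", "D"), ("A", "E"), ("E", "D")]
print(average_degree(spans))  # 2.4
print(find_chains(spans))
```

In this example three of the five nodes are degree-2, which is the situation in which mesh efficiencies are available only at the two anchor nodes.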
Definitions
The most common practical aim in the design of survivable transport networks is to achieve 100% restorability against any single span failure either through network protection or restoration using a designed-in allocation of spare capacity. We use the term spare to denote any such designed-in reserve capacity whether technically for protection or restoration. Generally protection is used for schemes where the spare capacity is reserved and dedicated to cover a specific set of failure scenarios such as in 1+1 diverse-routed protection, or path- or line-switched rings. Restoration refers to arrangements where a network-wide allocation of spare capacity is not dedicated to any specific failure but is configured as needed to restore affected carrier signals as failures arise. Restoration schemes can generally achieve higher sharing of spare capacity than a corresponding protection scheme, but may require a more complex real-time process for the failure recovery.
Designing for 100% restorability means that all of the failed working demand units, in this case traffic-bearing lightwave links forming parts of end-to-end lightpaths, can be restored by replacement paths either end-to-end across the network or through detour-like path segments formed between the end-nodes of the failed span itself. The required replacement paths must be feasible for every single-failure scenario within the environment of spare wavelengths surviving after the failure. An obvious aim in designing any survivable mesh network is therefore to assure that all such restoration path-sets are feasible within a globally minimized total amount of spare capacity. Every span in a mesh-restorable network has a number of working capacity units and a designed-in number of spare-capacity units. In DWDM networking the units of both working and spare capacity are individual DWDM carrier wavelengths. The spare capacity on a span is not, however, for restoration of demands crossing the same span, but is for shared use in restoration routing for other span failures. Spare capacity is in every way identical to working capacity but it bears no actual traffic (or any such traffic is preemptible) when in the standby state. Each spare wavelength is also fully ready for use but is not yet cross-connected into any lightpath in the non-failure state.
The term span as used here has its origin in the transmission networking community to refer to a grouping of physical layer carrier signals between adjacent cross-connecting nodes that can undergo a common-cause failure. As Bhandari [13] explains “ . . . spans are the set of physical transmission fibers/cables in the physical facility graph. Links of the logical connectivity graph are built from spans. A given span can thus be common to a number of links.” A span is further defined by us as constituting the set of all the physical working and spare channels that terminate on adjacent cross-connecting nodes and share a common exposure to a single physical cut of their infrastructure, such as a duct or cable. Each working capacity unit on a span is thus part of a logical link in a client service-layer network, all such links being destined to fail together if the corresponding physical span fails. A span is thus like the more recent concept of a shared risk link group (SRLG). One physical entity failure may also produce more than one simultaneous span cut if more than one cross-connect adjacency is involved. Notwithstanding the specific meaning of span here, readers are advised that the more generic term link is often also used in this context. The intended meaning of link as either a service-layer or physical-layer entity has to be construed appropriately in each case.
Reversion is the process of returning affected demand flows back to their pre-failure routes from their restoration routes after physical repair of the failed span. In all cases which follow, other than with dedicated 1+1 APS protection, we are designing capacity for networks in which reversion is assumed to occur following a failure and its subsequent repair before there is any significant probability of a second failure onset. Mesh-restorable networks can be designed to sustain a second span failure while repair of the first failure is ongoing but the spare capacity penalty can be very high [14] and this is not generally the aim in the practical design of transport networks. It is, however, assumed that in networks where spare capacity is available for either restoration or new service provisioning, ongoing provisioning of new service paths during the restored state will have to be cognizant of the spare capacity used by the restoration process and provision new service paths accordingly. An alternative, however, is to operate a transport network with an envelope of working capacity, within which self-organizing ASTN-type service provisioning is conducted with a separate allocation of spare capacity for assured restoration of any single span failure within the working envelope. When it is the working envelope itself that is protected, ASTN operations can remain blind to the details of the failure and restoration reconfiguration.
The generic term demand refers to a working unit of aggregated traffic to be transported between origin-destination (O-D) nodes of the network. The term follows Wu's distinction between traffic itself and the demand units [15] required to transport it. Traffic, for example, consists of the individual IP packet and/or STS-level tributary flows exchanged between O-D pairs. Demand, by contrast, expresses the aggregate requirement of all traffic types for lightpaths between a given O-D pair. One unit of demand consumes one working wavelength on each span traversed on the route of the demand between O and D.
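The rule that one unit of demand consumes one working wavelength on every span of its route can be illustrated as follows. This is a sketch under assumed inputs: the helper name `working_capacity` and the two example routes are hypothetical.

```python
from collections import Counter

def working_capacity(routes):
    """Accumulate working wavelengths per span.

    routes: list of (node_path, demand_units). Each demand unit uses one
    working wavelength on every span its route traverses.
    """
    w = Counter()
    for path, units in routes:
        for a, b in zip(path, path[1:]):
            w[tuple(sorted((a, b)))] += units
    return w

# Hypothetical routed demands: 3 units A-C via B, 2 units B-D via C.
routes = [(["A", "B", "C"], 3), (["B", "C", "D"], 2)]
w = working_capacity(routes)
print(dict(w))
```

Span B-C carries both demands and therefore needs five working wavelengths, while A-B and C-D need three and two respectively.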
Loop-Back in Restoration Schemes
The simplest form of network protection is diverse-routed 1+1 automatic protection switching (APS) with a dedicated span- (or node-) disjoint protection (DP) path. 1+1 DP APS uses simple terminals but requires over 100% redundancy in terms of total wavelength-kms required. By the redundancy of a span or a network as a whole, we mean the ratio of total spare to total working capacity. Optical path-protection rings (OPPR) and optical shared protection rings (OSPR) [16] are the WDM-based counterparts to SONET UPSR and BLSR. The OPPR/UPSR structure is a logical collection of tributary-level 1+1 DP setups that is no more architecturally efficient than 1+1 APS, but is economically efficient because of the economy of scale in sharing of the optical line transmission capacity, and because of the relative simplicity of the OADM terminals. The OSPR/BLSR structure is more efficient than 1+1 DP APS or OPPR/UPSR because it uses a line-level loop-back mechanism, allowing sharing of protection capacity over all spans of the same ring. However, the best an OSPR/BLSR ring can do is achieve 100% redundancy because the protection capacity around the entire ring must meet the largest cross-section of working capacity anywhere in the ring.
This 100% matching of spare capacity to largest-working capacity is a general property of any degree-2 sub-network such as a ring or a chain of degree-2 nodes. A ring is just a sub-network of degree-2 nodal elements arranged in a cycle on the graph, while a chain is a connected segment of degree-2 nodes that does not close on itself. Loop-back refers to the mechanism and the spare capacity required for restoration routing in either a BLSR ring, or in a chain under span restoration. The main point to observe is that at any degree-2 site the spare capacity on the “East” side of the node must meet or exceed the working capacity on the “West” side of the same node, and vice-versa. The topology of a ring or chain dictates that to escape from a cut on one side of a node, the spare capacity on the other side must be sufficient to support loop-back of the failed working capacity on the cut side.
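The loop-back requirement can be made concrete with a small calculation. The sketch below assumes the working capacities of the spans are given in order along the ring or chain; the function names are hypothetical. For the chain, the East/West condition is applied transitively, so that each span carries enough spare to loop back the largest working capacity found on any other span of the same chain.

```python
def ring_redundancy(working):
    """Redundancy (total spare / total working) of a shared-protection ring.

    Every span carries protection capacity equal to the largest working
    cross-section anywhere in the ring.
    """
    spare = [max(working)] * len(working)
    return sum(spare) / sum(working)

def chain_loopback_spare(working):
    """Spare capacity per span of a degree-2 chain under loop-back.

    Spare on span j must cover the failure of any other span in the
    chain, so it equals the largest working capacity among the others.
    """
    spare = []
    for j in range(len(working)):
        others = working[:j] + working[j + 1:]
        spare.append(max(others, default=0))
    return spare

print(ring_redundancy([4, 4, 4]))       # 1.0, the best case (100%)
print(ring_redundancy([2, 4, 6]))       # 1.5, unbalanced working capacity
print(chain_loopback_spare([2, 4, 6]))  # [6, 6, 4]
```

In the chain result, the spare on each side of every intermediate node meets or exceeds the working capacity on its other side, as the loop-back condition requires.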
Mesh Restoration and Protection Schemes
Span restoration is the mesh technology equivalent to OSPR and BLSR rings in that restoration occurs by rerouting between the immediate end nodes of the break. Span restoration is like deploying a set of detours around the specific break in a road that disrupts working paths. Unlike rings, however, mesh span restoration need not be via a single route, nor via simple two-hop routes only. By analogy, if a highway has several lanes, there may be an independent detour path deployed for each lane, limited by a hop or distance limit, H, which can be considerably more than two hops. The basic re-routing and capacity design methods for span restoration can incorporate a hop or distance limit and/or an optical path loss limit. Setting the hop or distance limit allows a trade-off between the maximum length of restoration paths and the total spare capacity. As H is increased, more sharing-efficient patterns of re-routing are permitted until, at a threshold hop limit H*, the theoretical minimum of spare capacity is reached [20].
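The effect of the hop limit H on the set of candidate detours can be sketched with a simple depth-first enumeration. This is an illustrative assumption about how such routes might be listed, not a method stated in the references; the function name `eligible_routes` and the example topology are hypothetical.

```python
from collections import defaultdict

def eligible_routes(spans, failed, hop_limit):
    """List simple routes between the end nodes of the failed span.

    Routes avoid the failed span itself and use at most hop_limit hops.
    """
    adj = defaultdict(set)
    for a, b in spans:
        if {a, b} != set(failed):  # the cut span is unavailable
            adj[a].add(b)
            adj[b].add(a)
    src, dst = failed
    routes = []

    def extend(path):
        node = path[-1]
        if node == dst:
            routes.append(list(path))
            return
        if len(path) - 1 == hop_limit:  # hop budget exhausted
            return
        for nxt in sorted(adj[node]):
            if nxt not in path:  # keep routes loop-free
                path.append(nxt)
                extend(path)
                path.pop()

    extend([src])
    return routes

spans = [("A", "B"), ("B", "C"), ("C", "D"),
         ("A", "D"), ("A", "E"), ("E", "D")]
print(eligible_routes(spans, ("A", "D"), hop_limit=2))
print(eligible_routes(spans, ("A", "D"), hop_limit=3))
```

With H=2 only the detour through E is eligible for a cut of span A-D; raising H to 3 admits the longer detour through B and C as well, which gives a capacity design more freedom to share spare capacity.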
For comparisons of the restoration system of the present invention to existing schemes, we consider two variants of the span restoration capacity design problem. In the Spare Capacity Assignment (SCA) problem we consider span-restorable networks in which demands are first shortest-path routed, followed by optimal spare capacity assignment for 100% restorability. The total spare capacity is minimized independently of working capacity. In Joint Capacity Assignment (JCA) we consider span-restorable networks where the routing of working paths (and hence working capacity) is jointly optimized with spare capacity assignment to minimize total capacity. Self-organizing methods for this type of restoration, including distributed self-planning, are well developed from work in the 1990s [17], [18], and [32]. Although phrased in the language of the times, i.e., SONET, these schemes are fairly easily mapped into DWDM implementations between opto-electronic cross-connects, especially if a digital wrapper [36] is implemented. Alternatively, centralized control or OSPF-type path finding may be iterated to develop a set of k-shortest replacement paths for this type of restoration.
Shared backup path-protection and path-restorable networks are also considered here. In Shared Backup Path Protection (SBPP) we assume the shortest route is used for the working path and a single fully-disjoint route is selected for the backup path under optimization to permit sharing of spare capacity over all backup paths whose working paths are failure-disjoint. Demands on working paths that follow physically disjoint routes over the network will not need the restoration capacity simultaneously, hence restoration capacity sharing is permitted. This is logically the same scheme as was proposed for ATM Backup VP restoration [30] in the special case where the maximum permissible over-subscription factor [23] is limited to 1.0. The SBPP approach is receiving much attention in recent IETF deliberations [31]. SBPP is sometimes called failure-independent path protection because the route of the backup path is the same regardless of where a failure arises on the corresponding working path. This is argued to simplify activation and speed up cross-connection of the backup path. But it foregoes the opportunity in capacity planning to re-use the surviving “stub” portions of the failed path either for the same working demand or for restoration of any other demands that underwent simultaneous failure in the corresponding span cut.
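The sharing rule for SBPP can be illustrated numerically: on each span, the spare capacity needed is the largest total of backup paths that any single span failure would activate at once. The sketch below assumes each demand is given as a working route, a disjoint backup route and a number of demand units; the function names and the four-node example are hypothetical.

```python
from collections import defaultdict

def spans_of(path):
    """Set of undirected spans traversed by a node path."""
    return {tuple(sorted((a, b))) for a, b in zip(path, path[1:])}

def sbpp_spare(demands):
    """Spare capacity per span with sharing across failure-disjoint demands.

    demands: list of (working_path, backup_path, units).
    load[j][i] is the backup load placed on span j when span i fails.
    """
    load = defaultdict(lambda: defaultdict(int))
    for working, backup, units in demands:
        for i in spans_of(working):
            for j in spans_of(backup):
                load[j][i] += units
    # Only one span fails at a time, so each span needs the worst case.
    return {j: max(by_failure.values()) for j, by_failure in load.items()}

# Two demands whose working routes are span-disjoint (A-B and C-D),
# with backup routes that both cross spans A-C and B-D.
demands = [
    (["A", "B"], ["A", "C", "D", "B"], 2),
    (["C", "D"], ["C", "A", "B", "D"], 3),
]
spare = sbpp_spare(demands)
print(spare)
print(sum(spare.values()))
```

Spans A-C and B-D each need only three spare wavelengths rather than five, because the two working routes cannot fail together under a single span cut. The shared total is 11 units, against 15 if each backup path had dedicated capacity.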
In a path-restorable mesh network [21]-[22] demands affected by a span failure are restored simultaneously on an end-to-end basis for each O-D pair affected. This is done in a globally optimized manner that considers the specific failure and can exploit surviving stub capacity from failed working paths using stub release [22]. In a path-restorable network the total spare capacity is strictly sufficient only to support a multi-commodity maximum-flow (MCMF) type of simultaneous re-routing of all affected O-D pairs [32]. In its most capacity-efficient form this involves stub release in which the surviving working capacity units of failed paths are considered available as spare capacity for the particular restoration event. The automatic propagation of an Alarm Indication Signal (AIS) in a digital wrapper is a simple and fast means to effect stub release. The main difference relative to SBPP is that there is no single predetermined restoration route for each working path. Rather a collectively optimized re-routing of all failed paths will occur end-to-end in the presence of the specific failure, the surviving spare capacity following that failure, and the environment of stub release capacity. The path restorable designs we consider are non-joint in the same sense as above in that demands are first routed via their shortest paths before spare capacity is optimized. Further elaboration on the concept of stub release in path restoration is available in [21]-[23]. It has also been found in [21]-[23] that joint optimization adds little further efficiency to a path-restorable design so we consider the simpler non-joint case for comparison to the performance of the present invention.
Conventional Design of Span-Restorable Mesh Networks
The design of span-restorable mesh networks is most often approached using an arc-path integer linear programming (ILP) formulation introduced for SCA [20]. As our benchmark here we will use an extension of the model in [20] to include joint optimization of the working path routing (i.e., JCA) [25]. We define JCA as follows:
$S$: Set of spans in the network
$P_i$: Set of eligible routes for restoration of span $i$
$D$: Set of O-D pairs with non-zero demand
$d^r$: Number of demand units for O-D pair $r$
$Q^r$: Set of eligible working routes available for demand pair $r$
$\zeta_j^{r,q}$: 1 if the $q$th eligible route for working demands between O-D pair $r$ uses span $j$, zero otherwise
$\delta_{i,j}^p$: 1 if the $p$th eligible route for restoration of span $i$ uses span $j$, zero otherwise
$C_{j/l}$: Cost of a unit-distance of unit-capacity on span $j$
$L_j$: Length of span $j$
$f_i^p$: Restoration flow assigned to the $p$th eligible restoration route for span $i$
$s_j$: Number of spare capacity units placed on span $j$
$g^{r,q}$: Working capacity assigned to the $q$th eligible working route for demand pair $r$
$w_j$: Number of working capacity units on span $j$
$$\text{JCA: Minimize} \quad \sum_{j \in S} C_{j/l} \cdot L_j \cdot (w_j + s_j) \tag{1}$$

Subject to:

$$\sum_{q \in Q^r} g^{r,q} = d^r \qquad \forall r \in D \tag{2}$$

$$\sum_{r \in D} \sum_{q \in Q^r} \zeta_j^{r,q} \cdot g^{r,q} = w_j \qquad \forall j \in S \tag{3}$$

$$\sum_{p \in P_i} f_i^p = w_i \qquad \forall i \in S \tag{4}$$

$$s_j \ge \sum_{p \in P_i} \delta_{i,j}^p \cdot f_i^p \qquad \forall (i,j) \in S \times S : i \ne j \tag{5}$$
The objective function minimizes the total cost of capacity placed on all spans in the network. Constraints (2) ensure that all working demands are routed. Constraints (3) generate the required working capacity on each span j to satisfy the sum of all (pre-failure) working demands routed over it. Constraints (4) ensure that restoration for failure of span i meets the target level of 100%. Constraint set (5) forces sufficient spare capacity on each span j such that the sum of the restoration paths routed over that span is met for failure of any span i. The largest simultaneously imposed set of restoration paths effectively sets the s_j value on each span in the solution. To implement this type of formulation, one needs a pre-processing step to enumerate the sets of eligible working and restoration routes.
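The way constraints (4) and (5) interact can be checked by hand on a small case. The sketch below is not an ILP solver: it takes a given, assumed assignment of restoration flows, verifies that each failed span is fully restored as in (4), and derives the spare capacity on each span as the largest load imposed by any single failure as in (5). The function name, the working capacities and the flow assignment are all hypothetical.

```python
from collections import defaultdict

def span_key(a, b):
    return tuple(sorted((a, b)))

def spare_from_flows(working, flows):
    """Derive spare capacity per span from assigned restoration flows.

    working: {span: w_i}; flows: {failed span i: [(route, flow), ...]}.
    """
    spare = {j: 0 for j in working}
    for i, assignments in flows.items():
        # Constraint (4): restoration flow must equal the failed working capacity.
        if sum(f for _, f in assignments) != working[i]:
            raise ValueError(f"span {i} is not 100% restorable")
        used = defaultdict(int)
        for route, f in assignments:
            for a, b in zip(route, route[1:]):
                j = span_key(a, b)
                if j == i:
                    raise ValueError(f"route {route} uses failed span {i}")
                used[j] += f
        # Constraint (5): spare on j must cover the worst single failure.
        for j, load in used.items():
            spare[j] = max(spare[j], load)
    return spare

# Routes are written as strings of single-letter node names.
working = {("A", "B"): 2, ("B", "C"): 2, ("C", "D"): 2,
           ("A", "D"): 3, ("A", "E"): 1, ("D", "E"): 1}
flows = {
    ("A", "B"): [("ADCB", 2)],
    ("B", "C"): [("BADC", 2)],
    ("C", "D"): [("CBAD", 2)],
    ("A", "D"): [("AED", 2), ("ABCD", 1)],
    ("A", "E"): [("ADE", 1)],
    ("D", "E"): [("DAE", 1)],
}
spare = spare_from_flows(working, flows)
print(spare)
```

For this assignment every span ends up with two spare units. Span A-D, for instance, is loaded by five different failures but never by more than two units at once, which is the sharing that constraint set (5) captures.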