Modern communication networks are required to provide Quality of Service (QoS) assurance and fault resilience, while maximizing their utilization. This prompts the need for new dynamic restorable routing algorithms that exploit restoration resource sharing for achieving these goals. The best possible sharing can be obtained when the routing algorithms have complete information of all primary and restoration paths. However, representing this information can be quadratic in the number of links, making it impractical in large networks. Recently, M. Kodialam and T. V. Lakshman, “Restorable Dynamic Quality of Service Routing,” IEEE Communications Magazine, vol. 40, no. 6, pp. 72-81, 2002, also described in U.S. Pat. No. 6,584,071, (both of which are incorporated by reference herein in their entireties) have shown that significant resource sharing can be obtained when only partial information is available. Their work inspired the development of new routing algorithms that allow for considerable resource sharing. However, these algorithms do not provide any guarantee on the quality of the solutions.
Quality of Service (QoS) assurance and fault resilience have become fundamental requirements of modern communication networks such as MPLS-based IP networks, ATM networks and optical networks. These requirements are essential for supporting new real-time applications such as video conferencing and multimedia streaming. In current networks, resilience to failures is obtained by providing primary and restoration paths between each source-destination pair. QoS is guaranteed by allocating enough network resources, in terms of bandwidth and buffer space, along these paths. This calls for efficient restorable routing mechanisms that provide adequate primary and restoration paths while maximizing network utilization.
There is a rich body of work in the areas of network survivability and restorable routing, teaching that restorable routing mechanisms are required to provide resilience only for a single element failure (link or node), as these failures are rare events and the probability of two simultaneous failures is very low. Consequently, most current routing mechanisms provide fault tolerance by means of 1+1 or 1:1 protection. In a 1+1 protection scheme, two disjoint paths are provided and data are sent on both of them. The receiver uses the data that arrive on one of them (according to various criteria) and discards the data of the other path. This approach guarantees a fast recovery in case of failure, as the receiver just needs to switch from one path to the other. However, this fast recovery is obtained at the expense of low network utilization, since more than twice the required network resources are allocated to each connection. In a 1:1 protection scheme, the routing mechanism also provides two disjoint paths, but the data are sent only on one path, termed the primary path, and the restoration path is activated (by signaling) only in case of failure. Because protection is only sought for a single element failure, restoration resources can be shared by multiple restoration paths as long as their primary paths are not susceptible to the same failures, i.e., their primary paths are disjoint. To summarize, these two approaches present a clear tradeoff between recovery time and network utilization.
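The sharing argument above can be made concrete with a small sketch. It compares, for each link, the restoration bandwidth needed when every restoration path gets a dedicated reservation against the bandwidth needed when reservations are shared under a single-link-failure assumption. The function name, the tuple layout of a connection, and the link labels are hypothetical and chosen only for illustration; node failures and link capacities are ignored.

```python
from collections import defaultdict

def restoration_bandwidth(connections, shared=True):
    """Return {link: restoration bandwidth to reserve on that link}.

    Each connection is a tuple (demand, primary_links, restoration_links).
    Assumption: only single-link failures are protected against.
    """
    dedicated = defaultdict(float)
    # per_failure[e][f]: bandwidth rerouted onto link e when link f fails
    per_failure = defaultdict(lambda: defaultdict(float))
    for demand, primary, restoration in connections:
        for e in restoration:
            dedicated[e] += demand
            for f in primary:
                per_failure[e][f] += demand
    if not shared:
        # Dedicated reservation: every restoration path gets its own bandwidth.
        return dict(dedicated)
    # Shared reservation: only one link fails at a time, so link e needs
    # just the worst case over all possible single-link failures f.
    return {e: max(by_f.values()) for e, by_f in per_failure.items()}

# Two connections with disjoint primary paths share restoration link "X-Y";
# a third connection reuses primary link "A-B" and therefore cannot share
# with the first one.
conns = [
    (5.0, ["A-B"], ["X-Y"]),
    (3.0, ["C-D"], ["X-Y"]),
    (2.0, ["A-B"], ["X-Y"]),
]
print(restoration_bandwidth(conns, shared=False))  # dedicated reservation
print(restoration_bandwidth(conns, shared=True))   # shared reservation
```

In this example the dedicated reservation on link "X-Y" is the sum of all three demands, while the shared reservation is governed by the failure of "A-B", which reroutes the first and third connections at the same time.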
H. Hwang, S. Ahn, Y. Choi, and C. Kim, “Backup Path sharing for Survivable ATM Networks,” in Proceedings of ICOIN-I2, January 1998, and M. S. Kodialam and T. V. Lakshman, “Dynamic Routing of Bandwidth Guaranteed Tunnels with Restoration,” in Proceedings of IEEE INFOCOM '2000, Tel-Aviv, Israel, March 2000, teach that network utilization can be improved by using shared restoration resources. These works have prompted a search for restorable routing algorithms that exploit resource sharing for maximizing the network utilization, while still providing QoS guarantees. Generally speaking, the proposed methods can be divided into two categories.
Several papers consider the network design problem of providing an overlay topology with minimal allocated resources that provides restoration and QoS assurance for a known set of connection requests, where each request comprises a source-destination pair and a bandwidth demand. These include: Y. Liu, D. Tipper, and P. Siripongwutikorn, “Approximating Optimal Spare Capacity Allocation by successive Survivable Routing,” in Proceedings of IEEE INFOCOM '01, Anchorage, Ak., April 2001; O. Hauser, M. Kodialam, and T. V. Lakshman, “Capacity Design of Fast Path Restorable Optical Networks,” in Proceedings of IEEE INFOCOM '02, New York, N.Y., June 2002; C. Chekuri, A. Gupta, A. Kumar, J. Naor, and D. Raz, “Building Edge-Failure Resilient Networks,” in Proceedings of IPCO 2002, May 2002; and G. Italiano, R. Rastogi, and B. Yener, “Restoration Algorithms for Virtual Private Networks in the Hose Model,” in Proceedings of IEEE INFOCOM '02, New York, N.Y., June 2002.
Another set of papers addresses the issue of dynamic restorable routing with shared resources. Here, the system does not have a priori knowledge of the connection requests, and these studies propose different on-line routing schemes that handle the connection requests one by one. These schemes are required to minimize the amount of newly allocated resources per request (the so-called connection cost), while ensuring QoS requirements and fault tolerance. These include: X. Su and C.-F. Su, “An Online Distributed Protection Algorithm in WDM Networks,” in Proceedings of IEEE ICC '01, June 2001; S. Sengupta and R. Ramamurthy, “Capacity Efficient Distributed Routing of Mesh-Restored Lightpaths in Optical Networks,” in Proceedings of IEEE GLOBECOM '01, November 2001; E. Bouillet, J.-F. Labourdette, G. Ellinas, R. Ramamurthy, and S. Chaudhuri, “Stochastic Approaches to Route Shared Mesh Restored Lightpaths in Optical Mesh Networks,” in Proceedings of IEEE INFOCOM '02, New York, N.Y., June 2002; and G. Li, D. Wang, C. Kalmanek, and R. Doverspike, “Efficient Distributed Path Selection for Shared Restoration Connections,” in Proceedings of IEEE INFOCOM '02, New York, N.Y., June 2002.
One criterion for distinguishing between dynamic routing schemes is the information available to the algorithm. Kodialam and Lakshman show that the level of sharing depends on the kind of link-usage information that is available to the routing algorithm, which leads them to introduce three different information models. The first model is called the no information model, in which only the capacity and the total reserved bandwidth on each link are known and there is no knowledge available on restoration resources. Consequently, the routing algorithm cannot exploit resource sharing and the amount of allocated resources is the same as in the case of 1+1 protection. The second model is called the complete information model, since the routes of the primary and the restoration paths of every connection are known. This information permits the best possible sharing. However, in large networks this is not practical due to the large amount of required information. The third model is an intermediate one, called the partial information model, since the system keeps (slightly) more information than the no information model. In this model, rather than knowing only the total reserved bandwidth of each link, the system knows, for each link, both the total bandwidth reserved for primary paths and the total bandwidth allocated for restoration. Simulations show that with only a modest amount of information, the partial information model achieves significant resource sharing, and in many cases, an exact bandwidth reservation is made on each link. This makes the performance of the partial information model close to the ideal complete information case.
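The difference between the three information models can be illustrated by estimating how much extra restoration bandwidth a new connection needs on one candidate restoration link. The sketch below is an illustrative assumption and is not taken from the cited works: the function name, its parameters, and the bounding rule used for the partial information model (the total primary bandwidth on a link is an upper bound on what can be rerouted when that link fails) are hypothetical, and residual link capacity is ignored.

```python
def extra_restoration_bandwidth(model, demand, primary_links,
                                F=None, G_e=0.0, delta_e=None):
    """Extra bandwidth to reserve on one candidate restoration link e.

    model         : "none", "partial", or "complete"
    demand        : bandwidth of the new connection
    primary_links : links on the new connection's primary path
    F             : {link: total primary bandwidth on that link}
                    (used by the partial information model)
    G_e           : restoration bandwidth already reserved on e
    delta_e       : {link f: bandwidth already rerouted onto e if f fails}
                    (used by the complete information model)
    """
    if model == "none":
        # Nothing is known about restoration usage, so nothing can be shared.
        return demand
    if model == "partial":
        # Upper bound: everything carried on f might be rerouted onto e.
        worst = max(F.get(f, 0.0) for f in primary_links)
    elif model == "complete":
        # Exact amount rerouted onto e for each possible failure f.
        worst = max(delta_e.get(f, 0.0) for f in primary_links)
    else:
        raise ValueError(f"unknown information model: {model}")
    # Reserve only the shortfall, and never more than the demand itself.
    return min(demand, max(0.0, worst + demand - G_e))

# Hypothetical link state for a new connection with demand 4
# whose primary path uses links "a" and "b".
args = dict(demand=4.0, primary_links=["a", "b"], G_e=8.0)
print(extra_restoration_bandwidth("none", **args))
print(extra_restoration_bandwidth("partial", F={"a": 6.0, "b": 2.0}, **args))
print(extra_restoration_bandwidth("complete", delta_e={"a": 3.0}, **args))
```

With these numbers the no information model reserves the full demand, the partial information model reserves a smaller amount, and the complete information model finds that the existing reservation already suffices.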
C. Qiao and D. Xu, “Distributed Partial Information Management (DPIM) Schemes for Survivable Networks—Part i,” in Proceedings of IEEE INFOCOM '02, New York, N.Y., June 2002; and D. Xu, C. Qiao, and Y. Xiong, “An Ultra-Fast Shared Path Protection Scheme—Distributed Partial Information Management, part ii,” in Proceedings of ICNP '02, Paris, France, 2002, presented a distributed partial information management (DPIM) framework for restorable routing and described both IP-based and fast heuristic-based routing algorithms. By simulations, the authors have shown that these schemes perform better than the previous partial information schemes.
In Y. Xiong, C. Qiao, and D. Xu, “Achieving Fast and Bandwidth-Efficient Shared-Path Protection,” Journal of Lightwave Technology, vol. 21, no. 2, pp. 365-371, 2003, the authors observed that the DPIM framework tends to select long restoration paths over links with high restoration bandwidth. This phenomenon affects the recovery time and the paper presents a method for balancing between resource sharing and the restoration path length.
M. Kodialam and T. V. Lakshman, “Dynamic Routing of Locally Restorable Bandwidth Guaranteed Tunnels Using Aggregated Link Usage Information,” in Proceedings of IEEE INFOCOM '01, Anchorage, Ak., April 2001, extends the authors' schemes to local restoration, where each link or node in the primary path is protected by a bypass backup path. The paper presents a local restoration routing scheme that reduces the system recovery time while achieving network utilization similar to that of the global restoration approach.
All the above-mentioned studies have illustrated by simulation that using resource sharing significantly improves the network utilization in both the complete and the partial information models. However, the proposed schemes do not provide any guarantee on the quality of the calculated solutions.