The present invention relates to data networking and more particularly to systems and methods for providing fault tolerance to data networks.
As the Internet becomes a multi-media communications medium that is expected to reliably handle voice and video traffic, network protocols must also evolve to support quality-of-service (QoS) requirements such as latency and reliability and to provide guaranteed available bandwidths. One form that this evolution is taking is the advent of MPLS (Multi-Protocol Label Switching) Traffic Engineering, which may be supplemented by DiffServ-aware Traffic Engineering. Rather than using conventional IP routing techniques, where individual packets travel through the network following paths determined individually for each packet as it progresses through the network, MPLS Traffic Engineering exploits modern label switching techniques to build guaranteed-bandwidth end-to-end circuits through a network of label switched routers (LSRs). MPLS has been found to be highly useful in establishing such circuits, also referred to as label switched paths (LSPs). MPLS networks employing LSPs can more easily interoperate with other IP-based networks than can other virtual circuit-oriented networks employing, e.g., ATM or Frame Relay. Networks based on MPLS Traffic Engineering, especially those supplemented with DiffServ-aware Traffic Engineering, are very effective in handling delay- and jitter-sensitive applications such as voice over IP (VoIP) and real-time video.
Meeting the demands of businesses and consumers, however, also requires that bandwidth and latency guarantees continue to be met when links or nodes fail. When failure of a link or a node causes the failure of an LSP, the standard routing protocols such as constraint-based shortest path first (CSPF) are too slow to be used for dynamic rerouting of QoS-sensitive traffic. In optical networks employing SONET, fast restoration can be provided by means of features incorporated into the SONET protocol. However, where such techniques are not available, other protection mechanisms become necessary to ensure that services are restored within a sufficiently short time, e.g., 50 ms, such that the user experience is not affected.
To address this requirement, various fast reroute techniques have been developed that provide rapid reaction to failure of a link or node such that the user experience is preserved. In one such approach, individual nodes and links are protected against failure by establishing local backup tunnels (also implemented as LSPs) that are used to reroute all traffic around the failure. To protect a link, a backup tunnel is established connecting the two nodes that the protected link connects without including the protected link in the backup tunnel. To protect a node, a backup tunnel protects each pair of links traversing the node. If bandwidth protection is desired, each backup tunnel should have an allocated bandwidth.
Certain problems arise in implementing this backup scheme. To guarantee quality of service under failure conditions, the backup tunnel should have at least as much bandwidth as the primary bandwidth of the protected element (e.g., link or node in this context) or alternatively, at least as much bandwidth as consumed by LSPs that employ the protected element. However, it may be impossible to find a series of links to make up a single backup tunnel where each link has the required bandwidth. This is particularly true when network bandwidth is generally scarce.
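The difficulty of finding a single backup tunnel with the required bandwidth can be illustrated with a minimal sketch. It is not taken from the specification: the function name find_backup_path, the dictionary representation of links, and the example topology are all hypothetical, and a plain breadth-first search stands in for whatever constrained path computation a real implementation would use. The sketch looks for a path between the two end nodes of a protected link that avoids the protected link and uses only links with at least the required available bandwidth.

```python
from collections import deque


def find_backup_path(links, src, dst, protected_link, required_bw):
    """Return a node list from src to dst, or None if no single path qualifies.

    links maps a directed link (u, v) to its available backup bandwidth.
    All names here are illustrative assumptions, not part of the source text.
    """
    # Keep only links that are not the protected link and have enough bandwidth.
    adjacency = {}
    for (u, v), bandwidth in links.items():
        if (u, v) == protected_link or bandwidth < required_bw:
            continue
        adjacency.setdefault(u, []).append(v)

    # Breadth-first search over the remaining links.
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical topology: protected link A->B, two detours via C and via D.
example_links = {
    ("A", "B"): 10,
    ("A", "C"): 6, ("C", "B"): 6,
    ("A", "D"): 5, ("D", "B"): 5,
}
full_protection = find_backup_path(example_links, "A", "B", ("A", "B"), 10)
partial_protection = find_backup_path(example_links, "A", "B", ("A", "B"), 6)
```

In this assumed topology no single detour offers 10 units, so full_protection is None, while a 6-unit tunnel via C can be found. The two detours together offer 11 units, which is the kind of situation in which a single tunnel cannot carry the full protected bandwidth.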
Another concern is inefficient use of backup tunnels to protect parallel links that would fail together due to, e.g., a fiber cut, or parallel link pairs that would fail due to a node failure. One prior art approach allocates a separate backup tunnel to protect each link or path, or even to protect a single LSP, wasting valuable network resources such as router state, signaling resources, etc. Another prior art approach creates m backup resources to protect n primary resources but this approach is based on an assumption that only m of the n resources can fail simultaneously.
By virtue of one embodiment of the present invention, load balancing among fast reroute backup tunnels in a label switched network is achieved. M backup tunnels may be used to protect N parallel paths, all of which can fail simultaneously. A single backup tunnel may protect multiple parallel paths, saving on utilization of network resources such as router state and signaling information. A single path may be protected by multiple backup tunnels, assuring that bandwidth guarantees are met under failure conditions even when no single backup tunnel with sufficient bandwidth can be found. A packing algorithm is used to associate individual label switched paths (LSPs) with individual backup tunnels.
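The association of LSPs with backup tunnels can be sketched as a bin-packing problem. This passage does not say which packing algorithm is used, so the following is only an illustration under an assumed choice, the first-fit decreasing heuristic. The function name assign_lsps, the LSP and tunnel names, and the bandwidth figures are hypothetical.

```python
def assign_lsps(lsp_bandwidth, tunnel_bandwidth):
    """Assign each LSP to one backup tunnel using first-fit decreasing.

    lsp_bandwidth maps an LSP name to the bandwidth it needs protected.
    tunnel_bandwidth maps a backup tunnel name to its backup bandwidth.
    Returns a dict of LSP name -> tunnel name, or None if the heuristic
    fails to place some LSP. First-fit decreasing is an assumed choice.
    """
    remaining = dict(tunnel_bandwidth)
    assignment = {}
    # Place the largest LSPs first; they are the hardest to fit.
    ordered = sorted(lsp_bandwidth, key=lambda name: lsp_bandwidth[name], reverse=True)
    for lsp in ordered:
        needed = lsp_bandwidth[lsp]
        for tunnel in remaining:
            if remaining[tunnel] >= needed:
                remaining[tunnel] -= needed
                assignment[lsp] = tunnel
                break
        else:
            # No tunnel has enough remaining bandwidth for this LSP.
            return None
    return assignment


# Hypothetical example: four LSPs spread over two backup tunnels.
example_assignment = assign_lsps(
    {"lsp1": 4, "lsp2": 3, "lsp3": 2, "lsp4": 1},
    {"T1": 6, "T2": 5},
)
```

In this example the LSPs are spread over both tunnels, so neither tunnel alone needs the full 10 units. Because first-fit decreasing is a heuristic, a None result from this sketch does not prove that no valid assignment exists.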
When no assignment of LSPs to backup tunnels provides sufficient backup bandwidth for each LSP, a new primary LSP may be rejected or, alternatively, a new backup tunnel may be established for the new LSP or the bandwidth of the existing backup tunnels may be increased.
One aspect of the present invention provides a method for providing fast reroute protection in a label switched network. The method includes: identifying N paths to be protected together in the event all of them fail at the same time, the N paths originating at a first selected node of the label switched network and terminating at a second selected node of the label switched network; identifying M backup tunnels to protect the N selected paths; and selecting, for each of a plurality of label switched paths employing any of the N selected paths, one of the M backup tunnels as a backup to use upon failure. N or M is greater than or equal to 2.
Further understanding of the nature and advantages of the inventions herein may be realized by reference to the remaining portions of the specification and the attached drawings.