The present invention relates generally to data networking, and more particularly, to ensuring QoS (Quality of Service) of voice or mission-critical traffic during network failure.
The Internet and IP networks in general have become enablers for a broad range of business, government, and personal activities. More and more, the Internet is being relied upon as a general information source, business communication tool, entertainment source, and as a substitute for traditional telephone networks and broadcast media. As the Internet expands its role, users become more dependent on uninterrupted access.
To assure rapid recovery in the event of failure of a network link or node, Fast Reroute (FRR) techniques have been developed. In a network employing Fast Reroute, traffic flowing through a failed link or node is rerouted through one or more preconfigured backup tunnels. The preconfigured backup tunnels facilitate a key goal of Fast Reroute techniques, the redirection of interrupted traffic within tens of milliseconds. This minimizes impact on the user experience. The Fast Reroute techniques have been developed in the context of MPLS (Multiprotocol Label Switching) where traffic flows through label switched paths (LSPs). When an element such as a link or node fails, all of the LSPs using that failed element are redirected through preconfigured backup tunnels that route around the impacted segments of the LSPs.
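The redirection step described above can be sketched in a few lines. This is only an illustrative model, not an MPLS implementation: the function name `reroute_on_failure`, the representation of an LSP as a list of node names, and the tunnel name `bypass_R2` are all hypothetical, and only node failures are modeled.

```python
def reroute_on_failure(lsps, backup_tunnels, failed_element):
    """Return {lsp_name: backup_tunnel} for every LSP that traverses the failed element.

    lsps           -- hypothetical mapping of LSP name to the list of nodes on its path
    backup_tunnels -- hypothetical mapping of protected element to its preconfigured tunnel
    """
    tunnel = backup_tunnels.get(failed_element)
    rerouted = {}
    for name, path in lsps.items():
        if failed_element in path:
            # Every LSP using the failed element is switched to the same
            # preconfigured backup tunnel; no new path is computed here.
            rerouted[name] = tunnel
    return rerouted


# Illustrative topology (assumed, not from the text).
lsps = {
    "lsp1": ["R1", "R2", "R3"],
    "lsp2": ["R1", "R4", "R3"],
    "lsp3": ["R5", "R2", "R6"],
}
backup_tunnels = {"R2": "bypass_R2"}

affected = reroute_on_failure(lsps, backup_tunnels, "R2")
```

Because the backup tunnel is looked up rather than computed, the switchover cost in this sketch is a single table lookup per LSP, which mirrors why preconfiguration makes recovery within tens of milliseconds feasible.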
Providing strict QoS to voice during network failure remains an open problem in large-scale voice deployments where the proportion of voice traffic is high. Multiservice networks, such as those carrying telephony traffic, require very tight QoS as well as very fast recovery in case of network failure. A number of techniques, including Diffserv (Differentiated Services), MPLS Traffic Engineering, capacity planning, and RSVP (Resource ReSerVation Protocol) based CAC (call admission control), are available to provide very tight QoS in the absence of failure. However, none of these voice load control approaches performs well during a network failure. For example, when only capacity planning is used to ensure voice QoS, enough spare capacity must be provisioned to ensure that there is no congestion in failure cases. While many networks are provisioned to allow for single element failures, congestion may still occur if multiple failures occur concurrently (or if traffic load or traffic distribution is unexpected) unless gross overprovisioning is used.
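The capacity planning limitation can be illustrated with a small worked example. The sketch below assumes a deliberately simple additive model, in which each failure shifts a fixed amount of voice load onto one surviving link; the function name `congested_scenarios` and all of the numbers are hypothetical.

```python
from itertools import combinations


def congested_scenarios(capacity, base_load, rerouted, k):
    """Return the combinations of k concurrent failures that congest the link.

    capacity  -- link capacity (arbitrary units)
    base_load -- voice load carried by the link when nothing has failed
    rerouted  -- hypothetical mapping of failure name to the load it shifts onto the link
    """
    congested = []
    for combo in combinations(sorted(rerouted), k):
        # Assumption: rerouted loads simply add up on the surviving link.
        load = base_load + sum(rerouted[failure] for failure in combo)
        if load > capacity:
            congested.append(combo)
    return congested


# Illustrative figures (assumed): the link is provisioned for any single failure.
rerouted = {"link_1": 30, "link_2": 40, "node_3": 45}
single = congested_scenarios(capacity=100, base_load=50, rerouted=rerouted, k=1)
double = congested_scenarios(capacity=100, base_load=50, rerouted=rerouted, k=2)
```

With these figures, no single failure exceeds the link capacity, but every pair of concurrent failures does, which is the case that only gross overprovisioning would cover.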
With RSVP based CAC approaches, in the time interval immediately following a network failure, the IGP (Interior Gateway Protocol) may reroute traffic affected by the failure before a new admission control decision has been made. Thus, congestion may occur in this transient period, before CAC is performed and some calls are potentially torn down.
A number of techniques, such as MPLS/IP Fast Reroute discussed above, are available to provide very fast recovery in case of failure. However, there are only limited techniques available for protecting QoS over the period during which fast recovery mechanisms are in use. For example, with MPLS Fast Reroute, unless Bandwidth Protection mechanisms are used, there may be congestion that lasts until an alternate path is found. If no alternate path is found, the congestion will last indefinitely.
Bandwidth Protection builds on the use of MPLS Fast Reroute by allocating bandwidth to backup tunnels. Bandwidth Protection thus requires a very significant amount of bandwidth to be dedicated to backup to protect all voice traffic in all targeted failure scenarios. Bandwidth Protection attempts to minimize the amount of capacity allocated to FRR backup tunnels by including smart optimizations, such as sharing backup capacity for protection of different failures that are unlikely to happen at the same time. However, this approach still requires enough capacity to support all traffic after the failure; otherwise, all traffic flows may be degraded. Bandwidth Protection also cannot cope with unplanned combinations of failures.
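The effect of sharing backup capacity can be shown with simple arithmetic. The following sketch is an assumption-laden illustration rather than an actual Bandwidth Protection algorithm: it considers a single link, treats each targeted failure as needing a fixed amount of backup bandwidth on that link, and uses hypothetical names and figures throughout.

```python
def dedicated_backup(demands):
    """Backup capacity if every targeted failure gets its own reservation."""
    return sum(demands.values())


def shared_backup(demands):
    """Backup capacity if failures are assumed never to occur at the same time."""
    return max(demands.values(), default=0)


def is_protected(demands, concurrent_failures, reserved):
    """Check whether the reserved capacity covers a set of concurrent failures."""
    needed = sum(demands[failure] for failure in concurrent_failures)
    return needed <= reserved


# Illustrative backup demand on one link per targeted failure (assumed values).
demands = {"link_A_fails": 40, "link_B_fails": 25, "node_C_fails": 30}

reserved = shared_backup(demands)
single_ok = is_protected(demands, ["link_A_fails"], reserved)
double_ok = is_protected(demands, ["link_A_fails", "link_B_fails"], reserved)
```

In this example, sharing reduces the reservation from 95 units to 40 units and still covers any single targeted failure, but the unplanned combination of two failures needs 65 units and is no longer covered.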
A network operator is therefore left with two options to deal with network failures. The first is to allocate a large amount of capacity to make sure QoS of all the targeted traffic can be maintained during any failure scenario. The second is to accept that any flow from the targeted traffic may be degraded during a failure. Both of these options have drawbacks. For example, the first option is very expensive, and the second results in a possible degradation in QoS for all of the traffic flows.
There is, therefore, a need for a method and system for mitigating QoS degradation during network failure in different environments.