The present application relates to data networking and more particularly to systems and methods for rerouting around failed links and/or nodes.
The Internet and IP networks in general have become key enablers to a broad range of business, government, and personal activities. More and more, the Internet is being relied upon as a general information appliance, business communication tool, entertainment source, and substitute for traditional telephone networks and broadcast media. As the Internet expands its role, users become more and more dependent on uninterrupted access.
To assure rapid recovery in the event of failure of a network link or node, so-called “Fast Reroute” techniques have been developed. In a network employing Fast Reroute, traffic flowing through a failed link or node is rerouted through one or more preconfigured backup tunnels. Redirection of the impacted traffic occurs very quickly, typically in tens of milliseconds, to minimize impact on the user experience.
These Fast Reroute techniques have been developed in the context of MPLS Traffic Engineering, where traffic flows through label switched paths (LSPs). Typically, the overall network is configured such that traffic flows through guaranteed bandwidth end-to-end “primary” LSPs. It is also possible to establish short primary LSPs in a non-Traffic Engineering network, only for the purpose of taking advantage of Fast Reroute techniques (see the above-referenced patent application entitled “MPLS Reroute Without Full Mesh Traffic Engineering”).
In either case, when a link or node failure occurs, traffic affected by the failure is rerouted to the preconfigured backup tunnels. These backup tunnels are used only for a very short time because, simultaneously with the rerouting through the backup tunnels, the head ends of all affected primary LSPs are notified of the failure. This causes the head ends to reroute the primary LSPs around the failures so that the backup tunnels are no longer needed. It is generally assumed that the probability of multiple failures in such a short time is small, so each failure may be considered independently.
Under the independent failure assumption, link bandwidth available for backup tunnels may be shared between backup tunnels protecting different links or nodes. The techniques disclosed in U.S. patent application Ser. No. 10/038,259 make use of this assumption to allow available backup bandwidth to be shared among links or nodes to be protected while assuring that guaranteed bandwidth requirements continue to be met during Fast Reroute conditions. On the other hand, without taking advantage of the independent failure assumption, it is very difficult to assure guaranteed bandwidth during failure recovery while using bandwidth resources efficiently.
Mechanisms currently available for failure detection do not always allow the failure of a link to be distinguished from the failure of a node. For example, a network node may lose communication via a particular link without knowing whether only the link itself has failed or the node to which the link connects has failed. This ambiguity can cause the network to attempt to reroute around simultaneous failures when in fact only a single failure has occurred. The combined backup bandwidth requirements of simultaneous failures may exceed the available backup bandwidth on some links, leading to a violation of bandwidth guarantees and possible user perception of deteriorated service.
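By way of illustration only, the following Python sketch shows how the bandwidth arithmetic changes when a single failure is treated as two simultaneous ones. The tunnel list, element names, link name, and bandwidth figures are hypothetical and are not taken from the referenced application. The sketch assumes that the backup bandwidth a link must hold is the largest demand over all failure scenarios, where a scenario is the set of elements treated as failed together.

```python
from collections import defaultdict

# Hypothetical backup tunnels: (protected element, link traversed, bandwidth in Mb/s).
BACKUP_TUNNELS = [
    ("link A-B", "C-D", 40),
    ("node B", "C-D", 50),
    ("node E", "C-D", 30),
]


def demand_per_element(tunnels, link):
    """Sum the backup bandwidth each protected element places on `link`."""
    demand = defaultdict(int)
    for element, via, bandwidth in tunnels:
        if via == link:
            demand[element] += bandwidth
    return dict(demand)


def required_backup_bandwidth(tunnels, link, simultaneous_groups=()):
    """Worst-case backup bandwidth needed on `link`.

    Every protected element forms a single-failure scenario. Each entry in
    `simultaneous_groups` adds a scenario in which several elements are
    treated as failed at the same time.
    """
    demand = demand_per_element(tunnels, link)
    scenarios = [{element} for element in demand]
    scenarios += [set(group) for group in simultaneous_groups]
    return max(
        (sum(demand.get(element, 0) for element in scenario) for scenario in scenarios),
        default=0,
    )


# Independent failures: tunnels protecting different elements share the pool.
independent = required_backup_bandwidth(BACKUP_TUNNELS, "C-D")

# Ambiguous detection: loss of link A-B is also treated as failure of node B,
# so both sets of backup tunnels are activated at once.
ambiguous = required_backup_bandwidth(
    BACKUP_TUNNELS, "C-D", simultaneous_groups=[("link A-B", "node B")]
)

print(independent, ambiguous)
```

With these assumed figures, a backup pool of 60 Mb/s on link C-D would cover every single failure (worst case 50 Mb/s), but would be exceeded if the loss of link A-B were also treated as a failure of node B (90 Mb/s).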
In theory it would be possible to correct this ambiguity by centrally determining backup tunnels such that no such clash is possible. However, placing this constraint on backup tunnel placement leads to less efficient use of available bandwidth. Furthermore, computing the correct placement of backup tunnels would also become far more complex and computation-intensive.
Moreover, it is more desirable to compute backup tunnels in a distributed fashion rather than centrally. If backup tunnel computation is to be done in a distributed fashion across the network while avoiding such clashes, the task is made practically impossible by the need to signal a large amount of backup tunnel information among nodes. If link failures could be distinguished from node failures, the validity of the independent failure assumption would be strengthened, allowing backup tunnels to be computed in a distributed fashion and readily signaled with zero bandwidth, in accordance with the techniques disclosed in U.S. patent application Ser. No. 10/038,259, without compromise to bandwidth guarantees.
What is needed are systems and methods for determining whether a link or a neighboring node to which the link connects has failed.