Currently, the number of data networks and the volume of traffic these networks carry are increasing at an ever increasing rate. The network devices making up these networks generally consist of specialized hardware designed to move data at very high speeds. Typical asynchronous packet based networks, such as Ethernet or MPLS based networks, are mainly comprised of end stations, hubs, switches, routers, bridges and gateways. A network management system (NMS) is typically employed to provision, administer and maintain the network.
Multiprotocol Label Switching (MPLS) based networks are becoming increasingly popular especially in traffic engineering IP networks. MPLS uses a label switching model to switch data over a Label Switched Path (LSP). The route of an LSP is determined by the network layer routing function or by a centralized entity (e.g., a Network Management System) from the topology of the network, the status of its resources and the demands of the user. Any suitable link state routing protocol may be used such as Open Shortest Path First (OSPF) or Intermediate System to Intermediate System (ISIS) routing protocol to provide the link state topology information needed by the network layer routing to engineer data traffic. Another possibility is to utilize a local neighbor-discovery protocol whereby the global topology is maintained by a centralized management entity. LSPs may be set up using any suitable signaling protocol such as RSVP-TE, CR-LDP or using the management plane (e.g., the NMS setting the relevant MIB items that create the LSPs).
There is increasing demand by users that networks include a mechanism for fast repair of failed links or nodes. Since a LSP traverses a fixed path in the network, its reliability is dependent on the links and nodes along the path. It is common for many networks to provide some form of protection in the event of failure. For example, in the event of a link or node failure, the network can be adapted to switch data traffic around the failed element via a protection route.
The protection of traffic can be accomplished in several ways using the MPLS framework. Two ways that traffic can be protected using MPLS include recovery via LSP rerouting or via MPLS protection switching or rerouting actions.
The two basic models for path recovery include path rerouting and protection switching. Protection switching and rerouting may be used in combination. For example, protection switching provides a quick switchover to a recovery path for rapid restoration of connectivity while slower path rerouting determines a new optimal network configuration at a later time.
In recovery by path rerouting, new paths or path segments are established on demand for restoring traffic after the occurrence of a fault. The new paths may be chosen based upon fault information, network routing policies, pre-defined configurations and network topology information. Thus, upon detecting a fault, paths or path segments to bypass the fault are established using the signaling protocol. Note that reroute mechanisms are inherently slower than protection switching mechanisms, since more processing and configuring must be done following the detection of a fault. The advantage of reroute mechanisms is that they are cheaper since no resources are committed until after the fault occurs and the location of the fault is detected. An additional advantage of reroute mechanisms is that the LSP paths they create are better optimized, and therefore consume less network resources.
Note also that once the network routing algorithms have converged after a fault, it may be preferable, to re-optimize the network by performing a reroute based on the current state of the network and network policies in place.
In contrast to path rerouting, protection switching recovery mechanisms pre-establish a recovery path or path segment, based on network routing policies and the restoration requirements of the traffic on the working path. Preferably, the recovery path is link and node disjoint with the working path. When a fault is detected, the protected traffic is switched over to the recovery path(s) and restored.
The resources (i.e. bandwidth, buffers, processing, etc.) on the recovery path may be used to carry either a copy of the working path traffic or extra traffic that is displaced when a protection switch occurs leading to two subtypes of protection switching. In the first, known as 1+1 protection, the resources (bandwidth, buffers, processing capacity) on the recovery path are fully reserved, and carry the same traffic as the working path. Selection between the traffic on the working and recovery paths is made at the path merge LSR (PML).
In the second, known as 1:1 protection, the resources (if any) allocated on the recovery path are fully available to low priority or excess information rate (EIR) traffic except when the recovery path is in use due to a fault on the working path. In other words, in 1:1 protection, the protected traffic normally travels only on the working path, and is switched to the recovery path only when the working path has a fault. Once the protection switch is initiated, the low priority or EIR traffic being carried on the recovery path is displaced by the protected traffic. This method affords a way to make efficient use of the recovery path resources.
An example of protection switching in MPLS networks is described below. Consider an example MPLS based network incorporating a bypass tunnel. The network comprises a plurality of label switched routers (LSRs) connected by links. Backup tunnels are established for protecting LSPs statically by the management station or using RSVP signaling. RSVP extensions for setting up protection tunnels have been defined. To meet the needs of real-time applications such as video on demand, voice over IP, etc., it is desirable to affect the repair of LSPs within tens of milliseconds. Protection switching can provide such repair times.
The LSPs can also be protected (i.e. backed up) using the label stacking capabilities of MPLS. Instead of creating a separate LSP for every backed-up LSP, a single LSP is created which serves to backup a set of tunnels. Such a tunnel is termed a bypass tunnel. The bypass tunnel itself is established just like any other LSP-based tunnel. The bypass tunnel must intersect the original tunnel(s) somewhere downstream of the point of local repair. Note that this implies that the set of tunnels being backed up all pass through a common downstream node. Candidates for this set of tunnels include all tunnels that pass through the point of local repair and through this common node which do not use the facilities being bypassed.
To repair the backed up tunnels, packets belonging to a failed tunnel are redirected onto the bypass tunnel. An additional label representing the bypass tunnel is stacked onto the redirected packets. At the last LSR of the bypass tunnel, the label for the bypass tunnel is popped off the stack, revealing the label that represents the tunnel being backed up. An alternative approach is to pop the bypass-tunnel label at the penultimate LSR of the bypass tunnel.
As described above, two types of protection include end-to-end protection and local protection. The former provides an alternative backup path in the event a failure occurs along the primary path. The latter provides protection at the core wherein each link is protected by a backup protection tunnel. In the event of a link failure, local protection quickly restores traffic through the bypass protection tunnel. No efficient mechanism exists, however, to inform the edge devices that a bypass protection tunnel is in use. This is important since the edge nodes are the only nodes that implement end-to-end protection for a path. Once notified, the edge nodes may divert traffic to a backup path. In addition, such a mechanism enables the end-nodes of a tunneling-service to report the fact that the service goes through a protection tunnel.
There is thus a need for a notification mechanism capable of informing the edge nodes on the ends of a path that local protection has been deployed somewhere along the path. The notification mechanism should be scaleable and simple to implement without demanding large computing resources from intermediate nodes.