The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
In packet-switched networks consisting of multiple network elements such as routers and switches, the Resource Reservation Protocol (“RSVP”) may be used to reserve routing paths for the purpose of providing optimized routing of specified kinds of network traffic, such as voice traffic. RSVP is described in Braden et al., “Resource ReSerVation Protocol (RSVP)—Version 1, Functional Specification,” Request for Comments (RFC) 2205 of the Internet Engineering Task Force (IETF), September 1997. In general, RSVP can be used to reserve resources in order to achieve a desired quality of service (QoS) for a particular kind of traffic. Resource reservations established using RSVP messages expire over time unless the reservations are refreshed.
RSVP defines sessions and flow descriptors. A session encompasses a data flow that is identified by its destination. A flow descriptor is a resource reservation that encompasses a filter specification (filterspec) and a flow specification (flowspec). When the reservation is implemented at a router, packets that pass a filter defined by the filterspec are treated as defined by the flowspec. RSVP operation is controlled using Path messages and Resv (reservation) messages.
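The relationship between a session, a filterspec, and a flowspec can be illustrated with a minimal sketch. The class names, the choice of sender address and port as filter criteria, and the single bandwidth field are assumptions made for illustration only; they are not taken from RFC 2205 or from any particular implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FilterSpec:
    # Hypothetical filter criteria: which sender's packets are selected.
    src_addr: str
    src_port: int


@dataclass(frozen=True)
class FlowSpec:
    # Hypothetical treatment: the bandwidth reserved for selected packets.
    bandwidth_kbps: int


@dataclass(frozen=True)
class FlowDescriptor:
    # A flow descriptor pairs a filterspec with a flowspec.
    filterspec: FilterSpec
    flowspec: FlowSpec


@dataclass(frozen=True)
class Packet:
    src_addr: str
    src_port: int
    dst_addr: str


def classify(packet: Packet, session_dst: str,
             descriptors: List[FlowDescriptor]) -> Optional[FlowSpec]:
    """Return the flowspec that applies to the packet, or None if no
    reservation applies (the packet then gets default treatment)."""
    # The session is identified by its destination.
    if packet.dst_addr != session_dst:
        return None
    # Packets that pass the filter are treated as the flowspec defines.
    for d in descriptors:
        if (packet.src_addr, packet.src_port) == (d.filterspec.src_addr,
                                                  d.filterspec.src_port):
            return d.flowspec
    return None
```

In this sketch a packet belonging to the session but sent by an unlisted sender simply receives no reserved treatment.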
In general, RSVP operation at a host such as a router proceeds as follows. A sender issues a Path message; a receiver receives the message, which identifies the sender. As a result, the receiver acquires reverse path information and may start sending Resv messages to reserve resources at intermediate hosts. The Resv messages propagate through the internet and arrive at the sender. The sender then starts sending data packets and the receiver receives them.
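The Path and Resv exchange described above can be sketched as two passes over a route: a downstream pass in which each node records its previous hop, and an upstream pass that follows that recorded reverse path. The function names and the representation of a route as a list of node names are hypothetical simplifications; timers, refreshes, and message formats are omitted.

```python
from typing import Dict, List


def send_path(route: List[str]) -> Dict[str, str]:
    """Simulate a Path message travelling from route[0] (the sender) to
    route[-1] (the receiver). Each node records the node it received
    the message from, which is its reverse-path information."""
    prev_hop: Dict[str, str] = {}
    for upstream, node in zip(route, route[1:]):
        prev_hop[node] = upstream
    return prev_hop


def send_resv(receiver: str, prev_hop: Dict[str, str]) -> List[str]:
    """Simulate a Resv message travelling upstream from the receiver.
    Returns the nodes visited, in order, ending at the sender."""
    visited: List[str] = []
    node = receiver
    while node in prev_hop:
        node = prev_hop[node]
        visited.append(node)  # a reservation would be installed here
    return visited
```

For a route of sender, A, B, receiver, the Resv pass in this sketch visits B, then A, then the sender, which is the reverse of the order in which the Path message travelled.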
Extensions to RSVP for various purposes are described in RFC 3209, RFC 3471, RFC 3473, and RFC 3477, full citations of which are: “Extensions to RSVP for LSP Tunnels”, D. Awduche, et al., RFC 3209, December 2001; “Generalized Multi-Protocol Label Switching (GMPLS) Signaling Functional Description”, L. Berger, et al., RFC 3471, January 2003; “Generalized Multi-Protocol Label Switching (GMPLS) Signaling Resource ReserVation Protocol-Traffic Engineering (RSVP-TE) Extensions”, L. Berger, et al., RFC 3473, January 2003; “Signaling Unnumbered Links in Resource ReSerVation Protocol - Traffic Engineering (RSVP-TE)”, K. Kompella, Y. Rekhter, RFC 3477, January 2003. Throughout this document, familiarity with all the foregoing references is assumed.
In one respect, these extensions enable RSVP to interoperate with processes that implement Multi-Protocol Label Switching (MPLS). MPLS provides a way to establish expedited routing paths in networks. A management station can instruct a router that all packets bearing a specified label and arriving on a particular ingress port should be immediately routed to a particular egress port with a specified egress label applied to the packet upon egress. In this way, labeled packets bypass normal route processing decisions and move more rapidly through the network. Such expedited treatment is beneficial for network traffic that is sensitive to routing delays or latency, such as voice traffic.
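The label-based forwarding described above amounts to a table lookup keyed on ingress port and ingress label. The following is a minimal sketch under that assumption; the table layout, the function name, and the port and label numbers are hypothetical and do not correspond to any particular router's forwarding plane.

```python
from typing import Dict, Optional, Tuple

# Maps (ingress port, ingress label) to (egress port, egress label).
CrossConnects = Dict[Tuple[int, int], Tuple[int, int]]


def switch_labeled_packet(table: CrossConnects, ingress_port: int,
                          ingress_label: int) -> Optional[Tuple[int, int]]:
    """Return the (egress port, egress label) for a labeled packet, or
    None if no entry exists and the packet cannot be label-switched."""
    return table.get((ingress_port, ingress_label))
```

Because the decision is a single exact-match lookup, a labeled packet in this sketch never consults the ordinary routing logic.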
Typically, a software process implements RSVP in a router or switch. However, the RSVP process, or other elements of the router or switch, may fail from time to time, and therefore there is a need for a way to restart the RSVP process and re-establish knowledge of paths and labels.
RFC 3209, section 5, describes a “Hello” extension to RSVP that enables one RSVP node to detect that a neighboring RSVP node has failed, by sending a “Hello” message and awaiting an acknowledgment. RFC 3473, section 9, describes an RSVP restart procedure that may be termed “graceful restart.” The graceful restart procedure enables a network element that has experienced a failure in the MPLS control plane, but that has preserved information in the MPLS forwarding plane (a “nodal fault” in the terminology of RFC 3473), to re-create state information necessary for subsequent processing of RSVP messages and labeled packets.
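The Hello-based failure detection described above can be sketched as a timeout on acknowledgments. The function name, the use of floating-point seconds, and the default multiplier of 3.5 Hello intervals are assumptions chosen for illustration, not a statement of what RFC 3209 requires.

```python
def neighbor_presumed_failed(last_ack_time: float, now: float,
                             hello_interval: float,
                             intervals_allowed: float = 3.5) -> bool:
    """Return True if no acknowledgment has arrived from the neighbor
    within the allowed number of Hello intervals (assumed values)."""
    return (now - last_ack_time) > hello_interval * intervals_allowed
```

A check of this kind only reports whether the neighbor is answering at the moment it is queried. A neighbor that failed and finished restarting between two checks answers normally.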
According to RFC 3473, a network node recreates its state information based on replayed RSVP messages received from neighboring nodes, and also based on retrieving information from its own forwarding plane. Using this procedure, nodes acting as endpoints of MPLS label switched paths can retrieve port and label information. Nodes acting as midpoints in MPLS paths can obtain cross-connect information that defines which egress port and egress label is applied to packets arriving at a specified ingress port with a specified ingress label.
Under certain circumstances, the RSVP processes of multiple nodes in a network may fail concurrently. Such multiple failures may arise when a first node fails and a second node is unable to obtain needed information and therefore fails. Alternatively, a hardware fault or software fault unrelated to RSVP or MPLS may result in cascading failures of RSVP processes on several nodes. Thus, there is a need for a way to properly handle the restart of multiple RSVP processes on different nodes.
The graceful restart procedure of RFC 3473 is adequate for some cases of multiple node restart, but not all. For purposes of illustrating the deficiencies of RFC 3473, FIG. 1 is a simplified block diagram of a hypothetical network. Network 100 comprises network elements R1, R2, R3, which are communicatively coupled by links L1, L2. The network elements R1, R2, R3 comprise routers or switches, and links L1, L2 are any form of communication link. Assume that R1 has sent PATH messages that establish a label switched path spanning L1, R2, L2, R3.
In this arrangement, assume that the RSVP processes on R2 and R3 have failed. Assume further that the RSVP process of R2 restarts and sends Hello packets to its neighbors, as provided in RFC 3209 and RFC 3473. If R3 restarts after the Hello packets are sent, then R2 is able to determine that R3 has restarted. To re-establish the label switched path described above, R1 then sends a PATH message that includes a Recovery Label. R2 forwards the PATH message to R3, likewise including a Recovery Label. Accordingly, R1, R2, and R3 properly re-establish the path, and in particular, R2 and R3 can correctly recover all RSVP state information that they had before restarting.
However, if R2 restarts and then R3 restarts before R2 has sent Hello packets, then RFC 3473 provides no way for R2 to detect that R3 has restarted and is in Recovery mode. Instead, in this scenario R2 can send a Hello packet, which R3 acknowledges properly because R3 has completed restarting. Therefore R2 assumes that R3 never failed at all, and there is no way defined in the protocol or its extensions to determine that an intervening failure occurred. Accordingly, when R2 receives a PATH message from R1 with a Recovery Label, R2 will send a PATH message on to R3 that includes a Suggested Label rather than a Recovery Label, because R2 is unaware that R3 previously received the same path. Including a Suggested Label is incorrect, because upon receiving the PATH message, R3 will find a duplicate label in its forwarding plane information and will be unable to accept the Suggested Label.
The RFCs defining RSVP permit establishing new label switched paths only if they do not collide with label switched paths that are in the process of being recovered. Therefore, R3 will either reject the PATH message, or accept the PATH message but use an upstream label different from the Suggested Label. In either case, R3 fails to re-establish the state it had before the failure.
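The two restart orderings discussed above can be contrasted in a simplified sketch. The function names, the string constants, and the modeling of the forwarding plane of R3 as a set of preserved labels are hypothetical; the sketch captures only the decision that R2 makes and its consequence at R3, not the message formats of RFC 3473.

```python
from typing import Set


def choose_label_object(believes_downstream_restarted: bool) -> str:
    """Decision made by the upstream node (R2 in the example): use a
    Recovery Label only if it knows the downstream node restarted."""
    if believes_downstream_restarted:
        return "RECOVERY_LABEL"
    return "SUGGESTED_LABEL"


def process_path(label_object: str, label: int,
                 preserved_labels: Set[int]) -> str:
    """Outcome at the restarted downstream node (R3 in the example),
    whose forwarding plane still holds the labels in preserved_labels."""
    if label in preserved_labels:
        if label_object == "RECOVERY_LABEL":
            return "state recovered"
        # A Suggested Label that duplicates a preserved entry collides
        # with a path that is still being recovered.
        return "collision: suggested label not accepted"
    return "new path established"
```

With a preserved label of 300 at R3, the sketch yields "state recovered" when R2 detected the restart and a collision when it did not, which corresponds to the failure case described above.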
Thus, there is a need for an improved process for multiple RSVP nodes to recover from failure.