Rapid detection of, and recovery from, communication failures is a useful feature for any network system, but the need for such capability is particularly acute for certain types of networks and certain types of traffic, such as voice traffic and other types of “bearer” traffic. Correspondingly, there are a number of known techniques for testing the “liveness” of communication links.
Some of these techniques particularly apply to Open Systems Interconnection (OSI) Layer 3 connections over any OSI Layer 2 data link layer or media (e.g., Ethernet, ATM, etc.). Specific examples range from the relatively simple Layer 3 Internet Control Message Protocol (ICMP) Echo Request messages, which are sent at regular intervals to an adjacent node or router, to more elaborate solutions, such as the Open Shortest Path First (OSPF) routing protocol. However, these solutions do not in a general sense provide fast and reliable monitoring of the end-to-end Layer 3 (L3) connection between local and remote host nodes that are interconnected through a communication network, such as an IP network that includes multiple routing hops between the host nodes. One may note that an end-to-end L3 path may include, and often does include, multiple L3 segments, going from one hop to the next. Here, a “hop” can be either a router or a switch, for example.
As a more robust mechanism for end-to-end monitoring of L3 connectivity, the Internet Engineering Task Force (IETF) developed the “Bidirectional Forwarding Detection” protocol, which is referred to as BFD and is detailed in Request for Comments (RFC) 5880. Additional RFCs of interest include RFC 5881, RFC 5882, RFC 5883, and RFC 5884. RFC 5881 relates to single-hop connectivity, while RFC 5883 relates to multi-hop connectivity; both define the usage of the BFD protocol over an IPv4 or an IPv6 network.
As one of its several advantages, the BFD protocol provides a connectivity detection mechanism that can be used over any media, at any protocol layer, and with a wide range of detection times (including detection times of 50 ms or less) and overhead control. For an example of BFD-based connectivity monitoring, see U.S. Patent Publication 2007/0180105 A1 (2 Aug. 2007), which discloses the use of BFD for distinguishing between link and node failures.
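By way of illustration only, the following minimal sketch shows how a BFD detection time may be derived in asynchronous mode, following the rule in RFC 5880 that the local detection time equals the Detect Mult received from the remote system multiplied by the agreed remote transmit interval (the greater of the local Required Min RX Interval and the last received Desired Min TX Interval). The function name, parameter names, and the example timer values are assumptions chosen for this sketch and do not come from any particular BFD implementation.

```python
def bfd_async_detection_time_us(local_required_min_rx_us: int,
                                remote_desired_min_tx_us: int,
                                remote_detect_mult: int) -> int:
    """Return the local BFD detection time in microseconds (asynchronous mode).

    Hypothetical helper: the names are illustrative, but the arithmetic
    follows the asynchronous-mode rule stated in RFC 5880.
    """
    # The remote system may not transmit faster than the local system is
    # willing to receive, so the agreed interval is the larger of the two.
    agreed_remote_tx_us = max(local_required_min_rx_us, remote_desired_min_tx_us)
    # The session is declared down after this many consecutive intervals
    # pass without a received control packet.
    return remote_detect_mult * agreed_remote_tx_us


# Assumed example values: 15 ms intervals at both ends, multiplier of 3.
detection_us = bfd_async_detection_time_us(15_000, 15_000, 3)
print(f"detection time: {detection_us / 1000:.0f} ms")
```

With these assumed values the detection time is 45 ms, which falls within the sub-50 ms range noted above. A slower receiver dominates the result: if the local system advertises a 50 ms Required Min RX Interval, the detection time grows to 150 ms regardless of how fast the remote system wishes to transmit.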
More broadly, BFD may be understood as offering low overhead and rapid detection of connection failures. BFD also provides flexibility because it works over any type of Layer 2 media, including Layer 2 media types that do not inherently support strong failure detection, such as Ethernet, virtual circuits, tunnels, and Multi-Protocol Label Switched (MPLS) paths. However, BFD and other connection failure detection protocols do not in and of themselves provide any mechanism for determining the location of an L3 connection failure.
Such determinations are decidedly non-trivial. Providing a low-overhead and reliable approach to rapidly identifying failure locations is particularly challenging in network topologies offering multiple connection paths, whether for primary/backup usage or for load-balancing in multi-path routing.