A routing protocol defines a process by which network devices, referred to as routers in packet-switched networks, communicate with each other to disseminate information that allows the routers to select routes between any two nodes on a computer network. One type of routing protocol, referred to as a link state protocol, allows routers to exchange and accumulate link state information, i.e., information describing the various links within the network. With a typical link state routing protocol, the routers exchange information related to available interfaces, metrics and other variables associated with network links. This allows a router to construct its own topology or map of the network. Some examples of link state protocols include the Open Shortest Path First (OSPF) protocol and the Intermediate-System to Intermediate System (IS-IS) protocol.
The connection between two devices on a network is generally referred to as a link. Connections between devices of different autonomous systems are referred to as external links while connections between devices within the same autonomous system are referred to as internal links. Many conventional computer networks, including the Internet, are designed to dynamically reroute data packets in the event an individual link fails. Upon failure of a link, the routers transmit new connectivity information to neighboring devices, allowing each device to update its local routing table. Links can fail for any number of reasons, such as failure of the physical infrastructure between the devices, or failure of the devices interfacing with the link.
When a link or router in the network fails, routers using traditional link state protocols such as OSPF and IS-IS may take a long time to adapt their forwarding tables in response to the topological change resulting from node and link failures in the network. The process of adapting the forwarding tables is known as convergence. This time delay occurs because recovery from a failure requires each node to re-compute the shortest path algorithm to calculate the next hop for the affected nodes in the network. Until the next hops are re-computed, traffic being sent toward the failed links may be dropped. Current deployments take time in the order of 500 milliseconds to several seconds for detection and recovery from failures in the network. These large convergence times may adversely affect the performance of Voice over Internet Protocol (VoIP) and multimedia applications, which are extremely sensitive to traffic loss. Service providers are demanding end-to-end failure detection and recovery times to be less than 50 milliseconds.
One approach to reduce failure recovery time is to select an alternate next-hop in addition to the best next-hop for a destination. Along with the best next-hop, the alternate next-hop is installed in the packet forwarding component. When a link failure occurs, the router uses the alternate next-hop for packet forwarding until the shortest path algorithm has re-computed the next hops for the updated network topology and installed the re-computed next hops in the packet forwarding component. A basic remote LFA repair mechanism can be used, in which tunnels are used to provide additional logical links which can then be used as loop free alternates where none exist in the original topology.