Continuation of the rapid growth in Internet and telecommunication usage experienced during the past decade is predicated on corresponding increases in network bandwidth. This places ever higher demands on the network elements that make up the network, such as switches and routers. The higher demands placed on networks, particularly high-bandwidth backbones, have led to an increase in the number of network elements for a typical network, which, in turn, requires greater routing intelligence, longer routes (time-wise), and more sophisticated packet processing.
Under a typical communication between two endpoints at different geographic locations, data is encapsulated in the form of packets (e.g., TCP/IP (Transmission Control Protocol over Internet Protocol) packets) or cells (e.g., ATM (Asynchronous Transfer Mode) cells) to be transported via an underlying network protocol or protocol stack. The packets or cells are routed across a “virtual” communication path (route) in view of routing decisions made by the various network elements. Oftentimes, various packets corresponding to the same message are sent along different routes between the communicating endpoints and reassembled at the receiving endpoint to deliver the message.
Since network backbones and the like need to be available at all times, techniques have been developed to enable network elements to be added, removed, and temporarily shut down. This is facilitated, in part, via routing protocols that enable a given network element to be made aware of routes offered by other network elements. For example, “Hello” messages are used to facilitate the Open Shortest Path First (OSPF) and the Border Gateway Protocol (BGP) routing protocols. Hello messages containing routing information are exchanged between peers (e.g., adjacent network elements), and routing tables for each network element are updated in view of the routing information contained in the Hello message.
To support continuous availability, high-use networks, such as backbones, employ redundant network elements. This allows a given “internal” network element to fail or be taken offline without taking down the entire network. However, in some instances, it is not possible or practicable to provide redundant network elements at network ingress and egress points (e.g., at the network border elements).
In response to a network element failing or being taken offline, the peer network elements automatically (in most cases) detect that the element is no longer available. Accordingly, corresponding Hello messages are propagated throughout the network to indicate that route segments that include the network element are no longer available. This requires an update in the routing tables of the network elements. Also, since the Hello messages are usually transmitted over the same links used for network traffic, a portion of the available network bandwidth is consumed by their use. As might be expected, the greater the redundancy and size of a network, the greater the number of Hello messages used to reflect the change in network configuration, increasing the amount of bandwidth consumed by this non-revenue traffic.
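The detection and routing-table update described above can be illustrated with a minimal sketch. It assumes, following the description in this section, that Hello messages carry the routes a peer offers, and that a peer is declared unavailable once no Hello has been received from it for a fixed interval. The `Router` class, its method names, and the `DEAD_INTERVAL` value are hypothetical and are chosen only for illustration; they are not taken from any particular routing protocol implementation.

```python
from dataclasses import dataclass, field

DEAD_INTERVAL = 40.0  # seconds of silence before a peer is declared down (assumed value)


@dataclass
class Router:
    # Maps peer name -> time (seconds) at which its last Hello was received.
    last_hello: dict = field(default_factory=dict)
    # Maps destination -> set of peers currently offering a route to it.
    routes: dict = field(default_factory=dict)

    def receive_hello(self, peer, now, advertised):
        """Record the Hello and note every destination the peer offers."""
        self.last_hello[peer] = now
        for dest in advertised:
            self.routes.setdefault(dest, set()).add(peer)

    def expire_peers(self, now):
        """Declare silent peers down and withdraw the routes through them."""
        dead = [p for p, t in self.last_hello.items() if now - t > DEAD_INTERVAL]
        for peer in dead:
            del self.last_hello[peer]
            for dest in list(self.routes):
                self.routes[dest].discard(peer)
                if not self.routes[dest]:
                    # No remaining peer offers this destination.
                    del self.routes[dest]
        return dead
```

In this sketch, a destination offered by two peers remains reachable when one of them fails, whereas a destination offered only by the failed peer disappears from the table, which mirrors the role of redundant network elements discussed above.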
Additionally, the failure or removal of a network element creates a significant problem with respect to message/data delivery. During packet/cell routing, packets and cells are “temporarily stored” as they traverse each network element along a given route. Thus, if a network element goes down, all of the packets/cells that are currently stored on that element will be lost. This generally produces one of two results. For confirmed delivery protocols, such as TCP/IP, the sender will determine after a time-out period with no confirmation reply that the message was not received by the receiver, and resend the message. This consumes additional network bandwidth, and the delay may be aggravating to the recipient. Worse yet, under unconfirmed delivery protocols, such as UDP (User Datagram Protocol), data corresponding to the lost packets is irretrievably lost. For voice traffic, this situation will either create a gap in the telephone conversation, or drop the call completely.
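The difference between the two outcomes can be sketched as follows. This is a simplified illustration under stated assumptions, not a model of any real protocol stack: `FailingElement` is a hypothetical stand-in for a network element that loses the packets it holds when it goes down, and its return value stands in for whether a confirmation reaches the sender before the time-out expires.

```python
class FailingElement:
    """Hypothetical network element that loses the first `drops` packets."""

    def __init__(self, drops):
        self.drops = drops
        self.delivered = []

    def __call__(self, message):
        if self.drops > 0:
            self.drops -= 1
            return False  # packet lost; no confirmation before the time-out
        self.delivered.append(message)
        return True  # confirmation received


def send_confirmed(transmit, message, max_attempts=3):
    """Resend `message` until a confirmation arrives or attempts run out.

    Returns the number of transmissions used, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if transmit(message):
            return attempt
    return None


def send_unconfirmed(transmit, message):
    """Send once; the sender never learns whether the message arrived."""
    transmit(message)
```

With one lost packet, `send_confirmed` delivers the message on the second transmission at the cost of extra bandwidth and delay, while `send_unconfirmed` leaves the message undelivered.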
The failure of a network element also causes packet routes to change (to avoid the failed element), typically loading the network elements employed for the alternate paths, adding delays to the packet delivery. For example, networks are often configured in view of anticipated network traffic patterns, resulting in (ideally) load balancing of the network elements. When one of these elements fails, the proximate elements have to now handle rerouted traffic in addition to the traffic load they were configured for. This produces bottlenecks that reduce the operational bandwidth of the network as a whole.