1. Field of Art
The present invention relates generally to the field of data communications networks, and more particularly to systems and methods employed in data communications networks for determining whether, when and how individual links in the network should be removed from service (torn down). The invention may be employed in or with any number of systems and devices found in a variety of different types of data communications networks, including, for example, the network routers, gateways, servers, intelligent relays and switches that form the infrastructure and backbone of the Internet, private wide area networks (WANs), local area networks (LANs), telephone switching systems, wireless data communications networks, optical networks, and microwave or satellite communications systems.
2. Related Art
In the data communications industry, various methods and systems have been introduced to determine when an active network link should be torn down. Such tear-downs are necessary in order to remove faulty or failing links from a network so that traffic is not steered across links that are unlikely to deliver it successfully. For example, data communications networks based on the Synchronous Optical Network (SONET) protocol typically employ special fields within the SONET frame that allow the routers and switches to determine when a link in the network is failing or is about to fail. In such cases, the routers or switches affected by the failure are normally configured to tear down the failing link and use an alternative path. This technique is sometimes referred to as “automatic fail over.”
In data communications networks based on the Internet Protocol (IP), neighboring routing and switching devices often exchange “hello” messages in order to test the network links that connect them together. A failure to receive or acknowledge a certain percentage of these messages often causes a given router to tear down the network link that failed to deliver the messages. Routers and switches in such IP networks may be configured, for instance, to tear down a network link unless at least K of the most recent N “hello” messages have been successfully received. Thus, a system might require that at least 9 of the most recent 10 messages, or at least 3 of the most recent 5, have been received. A router or switch thus monitors the reception rate of the hello messages, and automatically tears down a link when these criteria are not met.
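By way of illustration only, the conventional K-of-N criterion described above might be sketched as follows. The class name HelloMonitor, its method names, and the choice to treat a newly established link as having a full window of received messages are assumptions made for this sketch; they are not taken from any particular router implementation.

```python
from collections import deque


class HelloMonitor:
    """Tracks the outcome of the N most recent hello intervals on one link.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """

    def __init__(self, k: int, n: int) -> None:
        if not 0 < k <= n:
            raise ValueError("require 0 < k <= n")
        self.k = k
        self.n = n
        # Assumption: a new link starts with a full window of received hellos,
        # so it is not torn down before N intervals have been observed.
        self.window = deque([True] * n, maxlen=n)

    def record(self, received: bool) -> None:
        """Record whether the hello expected in the latest interval arrived."""
        self.window.append(received)  # the oldest entry is dropped automatically

    def should_tear_down(self) -> bool:
        """True when fewer than K of the most recent N hellos were received."""
        return sum(self.window) < self.k


# Example: require at least 3 of the most recent 5 hellos.
monitor = HelloMonitor(k=3, n=5)
for _ in range(3):
    monitor.record(False)  # three consecutive missed hellos
print(monitor.should_tear_down())
```

Note that the decision in this sketch depends only on the history of the one link being monitored, which is the characteristic of such schemes discussed below.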
There are significant disadvantages to using such schemes to determine when network links should be torn down. These schemes almost always remove failing network links from service based solely on the status and/or performance of that failing link, without taking into account the status and/or performance of other links or groups of links in the network, or the role the failing network link currently plays in the overall operation and performance of the network as a whole. Unfortunately, tearing down failing network links without taking into account what is happening in the immediate vicinity of the failing link or elsewhere in the network often causes more problems than would have occurred if the failing network link had been left in service.
For example, even though a particular link may be exhibiting signs of current or impending failure (in terms of the link's speed, quality, security level or cost, for instance), the failing link may be the only link in the network connecting two particular nodes or two sub-networks in the overall network. Thus, tearing down that particular link may partition the network into two or more sub-sections that can no longer communicate with each other, which is a situation usually considered to be much worse than continuing to use the failing link. Even if the failing link is not the only remaining link between two network segments, it may still be better to keep the failing link in service if there are relatively few alternative links in service or if the failing link is still playing a valuable role in the overall performance of the network.
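The partitioning concern described above can be made concrete with a short sketch. It is offered only to illustrate the problem, not as the method of the present invention. The function name would_partition, the representation of the network as a list of node pairs, and the assumption of at most one link between any two nodes are all hypothetical choices made for this example.

```python
from collections import defaultdict, deque


def would_partition(links, failing):
    """Return True if removing `failing` leaves its two endpoints disconnected.

    `links` is an iterable of (node, node) pairs describing undirected links.
    Assumption: at most one link exists between any given pair of nodes.
    """
    a, b = failing
    adjacency = defaultdict(set)
    for u, v in links:
        if {u, v} == {a, b}:
            continue  # leave the failing link out of the graph
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Breadth-first search from one endpoint, looking for the other.
    seen = {a}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return False  # an alternative path exists
        for neighbor in adjacency[node] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return True  # no alternative path: tear-down would partition the network


# Example topology: a triangle A-B-C with node D attached only through C.
links = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(would_partition(links, ("C", "D")))  # D has no other path
print(would_partition(links, ("A", "B")))  # A can still reach B through C
```

A tear-down scheme that looks only at the failing link has no access to this kind of topology information.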
Suppose, for example, that in addition to the failing network link, all other links in the network (or a substantial number of them) are also underperforming or malfunctioning. This situation often occurs in radio frequency networks during periods of intense external interference from manmade sources or natural phenomena, such as a severe electrical storm. Systems utilizing conventional network link teardown algorithms, which blindly remove links from service based solely on the status or performance of each link, will eventually tear down every link in the network experiencing the interference, even though it might be vastly more desirable to continue using the links, albeit at a lower quality, slower speed, higher cost or lower level of security.
Accordingly, there is a need for systems and methods that consider the status and performance of other links in the network, as well as the role a failing link is playing in overall network performance, when determining whether, when and how failing links should be removed from service. The present invention addresses such a need.