A telecommunications network comprises a plurality of nodes connected together by means of, for example, optical fibers. If any of the fibers are cut, the traffic through a portion of the network is disrupted. To remedy this disruption, automatic protection switching is ordinarily provided to move disrupted traffic to dedicated spare circuits promptly, typically in less than 50 milliseconds. But this automatic protection switching requires a high dedicated spare channel capacity.
To avoid automatic protection switching in restoring disrupted traffic, there are some conventional methods that deal directly with each node of the network. Each node consists generally of one or more digital cross-connect switches. One of these restoration methods is a distributed restoration algorithm (DRA), a variant of which is often referred to as the self-healing network (SHN) method. SHN is an algorithm which runs independently in each of the nodes of a mesh network in which spans connect adjacent nodes. In order to provide a fast and reliable method of restoring traffic affected by fiber cuts and other failures using SHN, the intelligence is distributed in the network and is based specifically on signalling between nodes. Such signalling may be referred to as the use of signatures and/or messages. Signatures are dynamic in that they are sent on a continuous basis into the nodes, and more specifically into the digital cross-connect hardware in each node, where the state of the signature represents a given logic that changes from frame to frame. Upon receipt of these signatures, SHN reacts. There are several signature types in SHN, and these signatures may be in band or out of band: they can ride with the traffic as part of the payload itself, or they can be superposed on the supervisory overhead of a signal, such as, for example, in a SONET network.
Each node running SHN conveys the signalling between itself and its adjacent node. Thus, a signature is sent from node to node and each node in turn realizes the logical span(s) to which it is attached. As designed, a logical span contains a multiple number of links and connects two given nodes. A link in turn is a communications channel of any size, for example a DS3 or a DS1 standard communications signal.
A SHN null signature is exchanged between the nodes and carries the information necessary for a node to identify its neighbor(s). Thus, each node knows exactly which other node(s) it is connected to and has an identification of the logical span connecting it to those nodes. The null signatures are sent continuously and thus continuously update each node of its neighbor's presence for a given link. The status, in terms of the functioning of each link, is therefore reinforced continuously.
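The bookkeeping described above can be pictured with a minimal sketch. The class name `NeighborTable`, its method names, and the link and span identifiers are hypothetical and chosen only for illustration; the source does not specify how a node stores null-signature information.

```python
class NeighborTable:
    """Hypothetical per-node store of what null signatures have reported."""

    def __init__(self):
        # link ID -> (neighbor node ID, logical span ID)
        self._by_link = {}

    def on_null_signature(self, link_id, neighbor_id, span_id):
        # Null signatures arrive continuously, so each one simply
        # refreshes the entry for the link it arrived on.
        self._by_link[link_id] = (neighbor_id, span_id)

    def neighbor_of(self, link_id):
        # Returns None if no null signature has been seen on this link.
        return self._by_link.get(link_id)

    def links_on_span(self, span_id):
        # All links currently known to belong to the given logical span.
        return sorted(
            link for link, (_, span) in self._by_link.items() if span == span_id
        )
```

A node holding such a table could later look up which neighbor sits at the far end of a link that has gone into alarm.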
Whether a link in the network is a spare link or a working link is provisioned by the management of the network which in turn knows where the spare links and the working links of the network are. The spare links in essence are dedicated links that provide a capacity for restoration and which otherwise do not carry any traffic. These spare links are used only when the system detects a failure of a given link, or a set of links, in a logical span by some means, such as for example the detection of a lost signal or a maintenance signal by a line terminating element (LTE) in the cross connect switch of a node.
Once a link is determined to be in alarm, a node, more specifically the cross connect switch in the node, will retrieve the stored information received from the null signatures in an arbitration process to determine what node is connected on the other side of the failed link. In other words, the adjacent nodes sandwiching the failed link each have a node ID. From the arbitration process, one of the nodes is determined to be a sender and the other a chooser. These two adjacent nodes, the sender and the chooser, make up a custodial pair of nodes. A number of different arbitration methods may be used, such as one where numbers (node IDs) are assigned to the different nodes, with the node having the lower node ID becoming the sender and the node having the higher node ID becoming the chooser.
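The lower-ID arbitration rule mentioned above is simple enough to sketch directly. The names `CustodialPair` and `arbitrate` are hypothetical, and this is only one of the arbitration methods the text allows for.

```python
from typing import NamedTuple


class CustodialPair(NamedTuple):
    sender: int
    chooser: int


def arbitrate(node_a: int, node_b: int) -> CustodialPair:
    """Assign roles to the two nodes on either side of a failed link.

    Assumed rule (from the example in the text): the lower node ID
    becomes the sender and the higher node ID becomes the chooser.
    """
    if node_a == node_b:
        raise ValueError("custodial nodes must have distinct node IDs")
    return CustodialPair(sender=min(node_a, node_b), chooser=max(node_a, node_b))
```

Because both nodes apply the same rule to the same pair of IDs, each arrives at the same role assignment without exchanging any further messages.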
Having decided on a custodial pair of nodes, a flooding process for searching alternate routes, or alt-routes, is next started by the sender. Alt-routes are spare routes onto which traffic disrupted by the failed link may be directed in order to reach the chooser.
The flooding process in SHN sends signatures into spare links. These flooding signatures or flooding messages carry certain information such as the sender node ID, the chooser node ID and an index. The index is simply a unique number that represents a given flooding route, for example a given flooding demand. It allows the logic in the nodes downstream from the broadcasting node to determine whether two signatures are different or are the result of a multicast of the same signature, or whether each signature represents a unique flooding pattern. Nodes downstream from the sender, which may be referred to as tandem nodes, will receive the signatures and detect a state change on the spare paths. The signatures will be multicast out to a particular spare link of each logical span that terminates at a tandem node. The spanning out of these signatures throughout the network will reach the chooser eventually through one or more of the tandem nodes if one or more alt-routes exist.
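The flooding behavior can be approximated by a breadth-first spread over the spare spans. The sketch below is a simplified model under stated assumptions, not the SHN implementation: each logical span is treated as a single spare link, the network is given as a hypothetical adjacency mapping `spare_spans`, each tandem node rebroadcasts only the first copy of a given signature it sees, and `FloodSignature` and `flood` are illustrative names.

```python
from collections import deque
from typing import NamedTuple


class FloodSignature(NamedTuple):
    sender: int
    chooser: int
    index: int
    hops: int  # incremented each time the signature crosses a span


def flood(spare_spans, sender, chooser, index):
    """Spread one flooding signature from the sender over spare spans.

    spare_spans maps a node ID to the node IDs reachable over a spare span.
    Returns {precursor node: signature as it arrives at the chooser}.
    """
    seen = {sender}  # nodes that have already rebroadcast this signature
    arrivals = {}
    queue = deque([(sender, FloodSignature(sender, chooser, index, 0))])
    while queue:
        node, sig = queue.popleft()
        for neighbor in spare_spans.get(node, ()):
            out = sig._replace(hops=sig.hops + 1)
            if neighbor == chooser:
                # The chooser does not rebroadcast; record the arrival.
                arrivals.setdefault(node, out)
            elif neighbor not in seen:
                # Tandem node: rebroadcast the first copy only.
                seen.add(neighbor)
                queue.append((neighbor, out))
    return arrivals
```

For example, with sender 1, chooser 2, and spare spans 1-3, 3-2, 1-4, 4-5 and 5-2, the chooser receives the signature from node 3 after two hops and from node 5 after three hops.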
A chooser recognizes that a flooding signature is meant for it by looking for its own node ID at the chooser node ID field. The chooser then responds to a given unique sender/chooser index combination by sending a complementary restoration signal, or a reverse linking message or signature. This signature travels back through the same path, or the same alt-route, to the sender to inform the sender that it has indeed reserved an alt-route for that particular demand and that the chooser is awaiting the arrival of the restored traffic. Alt-routes may be chosen by a chooser based on any arbitration method since a given sender/chooser index may arrive from several different logical spans, or links within a logical span. Typically, the shortest path, based on a number of hops or repeat counts measured or detected by the chooser node, is chosen.
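The chooser's selection step might look like the following minimal sketch, assuming the shortest-path rule mentioned above. The `Arrival` record, its `port` field, and the function name `select_reverse_links` are hypothetical; the source leaves the arbitration method open.

```python
from typing import NamedTuple


class Arrival(NamedTuple):
    sender: int
    chooser: int
    index: int
    port: str  # incoming port on which the flooding signature arrived
    hops: int  # hop count carried by the signature


def select_reverse_links(my_node_id, arrivals):
    """Pick, per sender/chooser/index combination, the arrival with fewest hops.

    Signatures whose chooser field is not this node's ID are ignored.
    On a tie in hop count, the earliest arrival is kept.
    """
    best = {}
    for a in arrivals:
        if a.chooser != my_node_id:
            continue
        key = (a.sender, a.chooser, a.index)
        if key not in best or a.hops < best[key].hops:
            best[key] = a
    return best
```

The chooser would then send a reverse linking signature out of each selected port and discard the other arrivals that carry the same sender/chooser index.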
Regardless of the method of arbitration a chooser uses, it will reserve the particular span that it has reverse linked onto and will typically ignore and discard any other signatures arriving from any other links with that same sender/chooser index. Incoming ports, or precursor ports, and outgoing ports of the tandem nodes along the alt-route are reserved by the reverse linking message so that a particular path through the matrix of each cross connect switch of a tandem node is mapped. The reverse linking signatures are transmitted back to the sender to let the sender know that an alt-route has been reserved by the chooser and the tandem nodes along the alt-route. Any priority function of choosing which traffic to connect onto the alt-route is performed at this time.
A second method for restoring traffic disrupted by fiber cuts is a centralized restoration scheme. This second method depends on a centralized intelligence that has a built in knowledge of the different nodes and links of the network, and a defined solution for a particular failed connection. There are advantages and disadvantages to this centralized restoration scheme. One of the disadvantages is that the topology of the network has to be stored in a centralized database and has to be updated every time a change occurs in the network. Thus, the implementation of the centralized restoration scheme becomes more difficult and extensive when the network changes very rapidly, such as, for example, when links are added, removed, or fail, or when spare links are changed to working links or vice versa. In any event, the important aspect of a centralized restoration scheme is reliance on a central intelligence and the ability to dynamically update this intelligence so that it knows the proper state of the network at the time of a failure.
On the other hand, the DRA mentioned previously distributes the intelligence among the different nodes of the network so that the intelligence of the network is gained from the signals and messages from the adjacent nodes and/or the nodes downstream from where the failure occurred. Thus, in the DRA scheme, a far away or far end node can use the network itself and the current state of the network as it is updated so that it knows exactly on which logical span traffic is to be routed, how many links are on each logical span, and what kind of links are on it. This means that the switching commands and the ability to find and reserve alt-routes, as well as the ability to switch the proper traffic into and out of those alt-routes, are all performed in each node individually using a distributed intelligence with a distributed set of rules, logic and mathematical algorithms. Thus, the DRA has the advantage over the centralized restoration scheme of being faster on a network wide level.
There are two different forms of distributed restoration algorithm. One is the span or link based distributed restoration scheme, of which SHN is an example, in which attempts are made to find alt-routes between two custodial nodes. The second form of DRA is a path based scheme in which the shortest and/or most reasonable end-to-end alt-route throughout the network is to be found. The jointed or connected set of links throughout the network of a given circuit forms the alt-route for this scheme. The conventional link based schemes theoretically treat the failure of a single logical span between two nodes, irrespective of the number of links within that span, as a single fiber cut.
However, in actuality, when a fiber cut does occur, for example when trench digging equipment cuts a cable, a number of cuts actually may occur, because a single cable contains many fibers. In addition, the cuts may not all occur at the same time. Because of the differences in time at which the different links of a span, or the different spans within a cable, are cut, a "greedy characteristic" is introduced into a real life SHN scheme which heretofore was not accounted for. In essence, the greedy characteristic results from the sender of the first cut circuit performing a so-called preemptive activated flooding in which it floods its restoration signals to a sufficient number of the available spare links of the network so as to reserve alt-routes for each link of the span. This preemptive activated flooding is used by the failed circuit to more quickly find alt-routes in the event of a partial failure of some links of a given span. Thus, it floods as if all of the links of a given span will eventually fail. In other words, with the first cut in a span, a preemptive activated flooding is performed to reserve sufficient spare capacity so that the traffic through most, if not all, of the links of the span can be restored.
But such preemptive activated flooding in fact can locate and reserve an excessive number of alt-routes, i.e., finding more alt-routes than it actually needs for a logical span. The sender of this cut circuit then would hold onto the spare circuits indefinitely until its restoration process is terminated.
Thus, if more than one logical span of the network is cut, or different circuits such as links at different spans are cut at approximately the same time, multiple sender/chooser pairs will become active. A race condition would ensue among the different senders of the cut circuits as they contend for the spare capacity and the alt-routes. Accordingly, there is no fair contention process between the multiple senders: the sender that first initiates flooding and has its reverse linking done before the others have a chance to flood and reserve any alt-routes could in actuality reserve nearly all of the alt-routes in the network for itself. It could therefore prevent other cuts from being repaired expeditiously.
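The unfairness can be seen in a toy model of first-come reservation. This sketch is purely illustrative: the function name `reserve_first_come`, the single shared pool of spare alt-routes, and the numbers in the usage example are assumptions, not measurements or part of SHN.

```python
def reserve_first_come(spare_routes, demands):
    """Grant alt-routes strictly in the order in which senders start flooding.

    spare_routes: total alt-routes available in the (simplified) shared pool.
    demands: list of (sender ID, alt-routes requested), earliest flooder first.
    Returns {sender ID: alt-routes actually reserved}.
    """
    remaining = spare_routes
    granted = {}
    for sender, wanted in demands:
        take = min(wanted, remaining)
        granted[sender] = take
        remaining -= take
    return granted
```

With a pool of 10 alt-routes, a first sender that preemptively requests 12 (as if every link of its span had failed) receives all 10, and a second sender that genuinely needs 4 receives none.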
There is therefore a need to provide a method, and a system therefor, of resolving the contention among multiple senders, and their respective demands, for the spare capacity of the network for restoring disrupted traffic after network failures.