The instant invention relates to a telecommunications network provisioned with a distributed restoration scheme or algorithm (DTNR), and more particularly to the execution of the distributed restoration process in the distributed restoration algorithm provisioned telecommunications network in response to a simulated failure.
In a telecommunications network having a plurality of interconnected nodes and provisioned with a distributed restoration algorithm or scheme, when a failure is detected anywhere in the network, the various nodes would automatically determine and implement the switching actions that can circumvent the failure. In the case of a failure of a network path, the various nodes or switches of the network would communicate with each other to ascertain the available capacity that may be used for restoral. In most of the distributed restoration algorithm (DRA) schemes, the nodes have no prior knowledge of the topology of the network.
When a designer of a telecommunications network contemplates which of the many DRA approaches to apply to a given network, it is generally necessary for the designer to perform extensive simulations to gauge the adequacy and performance of the different approaches. Such simulations are particularly important for identifying anomalous behavior of the network that otherwise is difficult to anticipate because of the dynamic multi-processing that takes place in such a network.
Although a simulation is useful for estimating the functionality of a DRA scheme prior to its implementation in the network, some aspects are still best measured in a “live” or operational network. Such aspects include the actual speed and behavior of the network during the restoration process. Further factors that make simulations less accurate include action propagation delays, topological changes, different software versions and unanticipated conditions that cause the simulations to differ from actual empirical findings.
There is therefore a need for means to superimpose a DRA simulation process within a “live” traffic bearing network without actually performing any of the switching that would disrupt traffic. In other words, the DRA algorithm needs to be exercised as if there is an actual failure, so that the performance of the network in a failure scenario can be more accurately measured. Moreover, in order not to tie up the network in the event that an actual failure does occur during the restoration exercise, an actual distributed restoration process needs to take over in the event of an actual failure.
To practice the restoration process of the instant invention in an operational DRA telecommunications network, an exercise information message is provided to one of the nodes of the network. This exercise information message contains information or data relating to a simulated failure and the missing in-band information that would have been exchanged over recovered links during a real event. The failure information includes, among other data, the origin-destination pair of the nodes, the failed paths and the recovered links, if any. Also included in the exercise information message is data relating to potential staggered cuts between the origin-destination pair.
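By way of illustration only, the contents of the exercise information message described above might be represented as follows. This is a minimal sketch; the class names, the field names and the representation of a staggered cut as a link identifier with a delay are assumptions made for the example and are not prescribed by the invention.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class StaggeredCut:
    """Hypothetical record: one simulated cut and when it occurs."""
    link_id: str
    delay_ms: int  # offset from the start of the exercise (assumed unit)


@dataclass
class ExerciseInformationMessage:
    """Hypothetical layout of the data named in the text."""
    origin: str                      # origin node of the pair
    destination: str                 # destination node of the pair
    failed_paths: List[str]          # paths treated as failed
    recovered_links: List[str] = field(default_factory=list)
    staggered_cuts: List[StaggeredCut] = field(default_factory=list)

    def cut_schedule(self) -> List[StaggeredCut]:
        """Return the simulated cuts in the order they would occur."""
        return sorted(self.staggered_cuts, key=lambda cut: cut.delay_ms)


message = ExerciseInformationMessage(
    origin="node_A",
    destination="node_D",
    failed_paths=["path_1", "path_2"],
    recovered_links=["link_7"],
    staggered_cuts=[StaggeredCut("link_3", 40), StaggeredCut("link_2", 10)],
)
```

Sorting the cuts by their offset gives the receiving node a simple ordering in which to apply the simulated failures.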
Upon receipt of the exercise information message, a distributed restoration process begins. But instead of utilizing and exchanging the various messages that are required for flooding the network and locating the spare or reusable capacity for rerouting the traffic, a set of structurally similar exerciser messages is used. These messages include an exerciser failure notification message, an exerciser explore message, an exerciser return message, an exerciser connect message, and an exerciser step completed message. Given that the structure of the various exerciser messages is the same as that of the messages used in the event of a real failure in the network, the distributed restoration process operates as if an actual failure has occurred. Thus, the network reacts by restoring what it perceives to be a disruption of the traffic by the simulated failure.
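One way to picture the relationship between the exerciser messages and their real counterparts is sketched below. The sketch assumes, purely for illustration, that each message carries a kind, a sender and a payload, and that an exerciser message differs from a real one only by a flag; the actual message formats are not specified here, and all names are hypothetical.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Kind(Enum):
    """The five message kinds named in the text."""
    FAILURE_NOTIFICATION = "failure_notification"
    EXPLORE = "explore"
    RETURN = "return"
    CONNECT = "connect"
    STEP_COMPLETED = "step_completed"


@dataclass(frozen=True)
class DraMessage:
    """Hypothetical common structure shared by real and exerciser messages."""
    kind: Kind
    sender: str
    payload: str
    exercise: bool = False  # assumed marker distinguishing exerciser messages


def to_exerciser(message: DraMessage) -> DraMessage:
    """Return a structurally identical copy marked as an exerciser message."""
    return replace(message, exercise=True)


real_explore = DraMessage(Kind.EXPLORE, sender="node_A", payload="spare=3")
exerciser_explore = to_exerciser(real_explore)
```

Because every field other than the marker is unchanged, the same handling logic can process both kinds of message, which is what lets the restoration process run as if the failure were real.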
To ensure that no actual cross-connections take place, the exerciser messages are configured to enable the network to execute the distributed restoration process only up to the point where cross connections are to be made. In the event that an actual failure does occur during the exercise restoration process, the network is provided with a preemption functionality that takes the network out of the exercise restoration process, once an actual fault is detected, so that an actual distributed restoration process can take place. To ensure that the exercise restoration process does not tie up the network for an indefinite period, a drop-dead timer is provided to each of the nodes of the network so that the execution of the exercise restoration process ends with the expiration of the timer, irrespective of whether or not the exercise restoration process has been completed.
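The three safeguards described above (stopping short of cross connection, preemption by an actual fault, and the drop-dead timer) can be sketched as a single control loop. This is a simplified, single-node illustration under stated assumptions: the step names, the `run_exercise` function, and the injected `real_fault_detected` and `now` callables are all hypothetical and stand in for whatever mechanisms a node actually uses.

```python
from enum import Enum
from typing import Callable, List, Sequence, Tuple


class Outcome(Enum):
    COMPLETED = "completed"   # ran up to the cross-connect point
    PREEMPTED = "preempted"   # an actual fault took priority
    TIMED_OUT = "timed_out"   # drop-dead timer expired


def run_exercise(
    steps: Sequence[str],
    real_fault_detected: Callable[[], bool],
    now: Callable[[], float],
    drop_dead_seconds: float,
) -> Tuple[Outcome, List[str]]:
    """Run exercise steps; never perform the 'cross_connect' step."""
    deadline = now() + drop_dead_seconds
    executed: List[str] = []
    for step in steps:
        if real_fault_detected():
            # Leave the exercise so the actual restoration can proceed.
            return Outcome.PREEMPTED, executed
        if now() >= deadline:
            # End the exercise whether or not it has completed.
            return Outcome.TIMED_OUT, executed
        if step == "cross_connect":
            # Stop at the point where cross connections would be made.
            break
        executed.append(step)
    return Outcome.COMPLETED, executed


STEPS = ["failure_notification", "explore", "return", "connect", "cross_connect"]
```

Passing the fault check and the clock in as callables keeps the sketch deterministic and makes each of the three exit conditions easy to exercise separately.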
As the exercise restoration process is taking place, the reactions of the various nodes of the network, as well as of the network itself, are measured and collected. The collected data are then provided to the designer of the network, or to the management of the network, as feedback on how the network would react to certain failures. A better design of the restoration process of the network could therefore be effected.
It is therefore an objective of the present invention to execute a distributed restoration process in a “live” traffic bearing telecommunications network, without actually performing any switching or cross connections that could disrupt traffic, using the same DRA scheme that would be employed in the event of an actual failure.
It is moreover an objective of the present invention to collect accurate measurements of the reactions of the network in response to a failure, without actually provoking a failure that would disrupt the traffic within the network.
It is still another objective of the present invention to provide a simulation restoration process that immediately yields to any actual failure event that occurs during the simulation restoration process.
It is yet still another objective of the present invention to provide a simulation restoration process that is capable of performing restoration in a particular sequence in response to multiple failures.