Availability and reliability are among the most basic requirements of telecommunications networks. Operating such a network therefore requires a mechanism for fixing failures quickly. In local area networks (LANs), e.g., in a building or on a campus, it may be sufficient to keep personnel and a pool of spare equipment on hand, because the proximity of the installation allows repairs or replacements to be made quickly.
Because of the geographical distances involved, this approach is either impossible or prohibitively expensive in transmission networks such as metropolitan or wide area networks (MANs and WANs, respectively). Hence, for these networks, the network itself, or the combination of network and network management, needs to provide the means and facilities to ensure sufficient availability. Typically, these network mechanisms are divided into protection and restoration.
Protection mechanisms known from transmission systems such as SDH (Synchronous Digital Hierarchy) systems require 100% spare capacity for protection in the network and mask a failure very quickly, typically in less than 50 ms.
Restoration mechanisms use spare capacity more economically but mask a failure more slowly, typically within a few seconds, because completely new paths through the network have to be established.
An example of restoration is shown in FIG. 2. In case of a failure of one link or node, the affected traffic is re-routed over other links which have capacity reserved for restoration purposes. The process of re-routing the traffic affected by the failure is called path restoration, as new paths are established to replace the failed ones. Restoration is classified by the time needed to restore the network: it is called fast restoration if all paths are restored in about 10 seconds or less, and slow restoration if all paths are restored within a few minutes. If restoration is a functionality provided by the network management, it is called centralized restoration; if it is a functionality provided by the network itself (as for protection), it is called distributed restoration.
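The re-routing step can be illustrated with a minimal Python sketch. It is not the method of any particular network or of FIG. 2. It assumes a hypothetical data model in which each link is an unordered node pair with an integer amount of spare capacity, and each path is a node sequence with an integer demand. The names `find_path` and `restore` are illustrative only, and the search is a plain breadth-first search applied to the affected paths one after another.

```python
from collections import deque


def find_path(spare, src, dst, demand, failed):
    """Breadth-first search from src to dst that skips the failed link and
    any link with less than `demand` units of spare capacity."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            route = []
            while node is not None:
                route.append(node)
                node = parent[node]
            return route[::-1]
        for link, cap in spare.items():
            if link == failed or cap < demand or node not in link:
                continue
            (nxt,) = link - {node}  # the other end of the link
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def restore(paths, spare, failed_link):
    """Re-route every path that crosses the failed link (path restoration).

    Returns (restored, lost): new routes by path name, and the names of
    paths for which no route with enough spare capacity was found.
    Assumption: capacity freed by the failed paths is not reused.
    """
    failed = frozenset(failed_link)
    spare = dict(spare)  # work on a copy so the caller's data is unchanged
    restored, lost = {}, []
    for name, (route, demand) in paths.items():
        hops = [frozenset(hop) for hop in zip(route, route[1:])]
        if failed not in hops:
            continue  # path is not affected by this failure
        new_route = find_path(spare, route[0], route[-1], demand, failed)
        if new_route is None:
            lost.append(name)
            continue
        for hop in zip(new_route, new_route[1:]):
            spare[frozenset(hop)] -= demand  # consume reserved capacity
        restored[name] = new_route
    return restored, lost


# Hypothetical four-node example: spare capacity per link.
spare = {
    frozenset(("A", "B")): 0,
    frozenset(("B", "C")): 2,
    frozenset(("A", "D")): 2,
    frozenset(("D", "C")): 2,
}
paths = {
    "p1": (["A", "B", "C"], 1),
    "p2": (["A", "B"], 2),
}
restored, lost = restore(paths, spare, ("A", "B"))
print(restored, lost)
```

In this example, `p1` is re-routed over A-D-C, after which the remaining spare capacity on A-D is too small for `p2`. This is the situation that spare-capacity planning is meant to avoid.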
These mechanisms are applicable to basically any network structure: ring, mesh, or hub structures, or combinations thereof. However, some mechanisms suit some structures better than others, so the planning must consider the specific network at hand in order to define the optimal configuration.
A basic problem in large networks is to determine where and how much spare capacity needs to be reserved for restoration purposes in order to ensure that any single failure in the network can be fully restored. For most of today's networks, this has been done manually, by making assumptions and simulating every possible failure to see whether full restorability is achieved. However, this test has to be repeated every time topological changes are made in the network, e.g., by adding or replacing single links or nodes.
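The simulate-and-check procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure used in any actual planning tool. Only single link failures are simulated. The traffic of a failed link is re-routed between the two end nodes of that link rather than end to end per path, which is a deliberate simplification of path restoration. The amount of traffic that can be re-routed is computed as a maximum flow over the spare capacities of the surviving links. The names `max_flow`, `unrestorable_links`, and `link` are hypothetical.

```python
from collections import deque


def max_flow(capacity, src, dst):
    """Edmonds-Karp maximum flow on an undirected graph.

    `capacity` maps frozenset({a, b}) to the spare units on that link.
    """
    residual = {}
    for pair, cap in capacity.items():
        a, b = tuple(pair)
        residual.setdefault(a, {})[b] = cap
        residual.setdefault(b, {})[a] = cap
    residual.setdefault(src, {})
    residual.setdefault(dst, {})
    total = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            node = queue.popleft()
            for nxt, cap in residual[node].items():
                if cap > 0 and nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        if dst not in parent:
            return total
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        node = dst
        while parent[node] is not None:
            bottleneck = min(bottleneck, residual[parent[node]][node])
            node = parent[node]
        node = dst
        while parent[node] is not None:
            residual[parent[node]][node] -= bottleneck
            residual[node][parent[node]] += bottleneck
            node = parent[node]
        total += bottleneck


def unrestorable_links(working, spare):
    """Simulate each single link failure in turn and return the links whose
    working traffic exceeds what the surviving spare capacity can carry."""
    failures = []
    for failed, load in working.items():
        a, b = tuple(failed)
        surviving = {l: c for l, c in spare.items() if l != failed}
        if max_flow(surviving, a, b) < load:
            failures.append(failed)
    return failures


def link(a, b):
    return frozenset((a, b))


# Hypothetical four-node ring: 2 working and 2 spare units on every link.
ring = [link("A", "B"), link("B", "C"), link("C", "D"), link("D", "A")]
working = {l: 2 for l in ring}
spare = {l: 2 for l in ring}
before = unrestorable_links(working, spare)

# Topological change: node E is attached to A by a single new link.
working[link("A", "E")] = 1
spare[link("A", "E")] = 1
after = unrestorable_links(working, spare)
print(before, after)
```

The ring alone is fully restorable, but once the single-homed node E is added, the link A-E has no alternative route. This illustrates why the check has to be repeated after each topological change.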