One of the advantages of a ring-based network is that the traffic between two nodes on the ring can be re-routed over a predetermined secondary route, if a failure should occur in a primary route. An example of such a network is a SONET ring, with predefined primary and secondary, or working and protection, routes between the nodes on the ring. The routes may be over redundant rings, which pass traffic simultaneously in opposite directions. Such a system is commonly referred to as a “unidirectional ring.”
When a failure or a significant degradation in, for example, the primary route, is detected on a SONET ring, the system must automatically re-route, or switch, affected traffic from the primary route to the secondary route. The re-routing, which is commonly referred to as “protection switching,” is performed in unidirectional systems by the destination nodes, that is, by the nodes that terminate the traffic or route the traffic off of the ring to a user or another network. In the example, the destination nodes switch from receiving the affected traffic over the primary route to receiving the traffic over the secondary route. The specifications for many SONET ring based networks require that protection switching be accomplished in 50 msec or less.
Recently, SONET rings have been incorporated into ATM systems. In these systems, ATM cells and frames are routed over the ring in virtual circuits. The virtual circuits that span the same nodes are bundled into virtual paths, and the rings are thus referred to as “virtual path rings.” The virtual paths in these virtual path rings are pre-established, in the sense that each virtual path is assigned a fixed amount of the ring bandwidth based, for example, on the service contracts of the associated users. A virtual circuit is included in a virtual path by allocating to it a fixed portion of the path bandwidth. If there is not enough available bandwidth within the virtual path for the virtual circuit, the system refrains from setting up the circuit, even if there is bandwidth otherwise available on the ring. Accordingly, the virtual path rings may not be able to accommodate bursty traffic, such as the traffic from a local area network, or “LAN.” For a more detailed explanation, refer to Bellcore Standard GR-2837.
In known prior systems, the virtual circuits are set up over the selected ring, which in the example is the primary ring. Each destination node includes a primary ring interface that is configured to forward the cells received over the virtual circuits to destination ports in accordance with stored routing information. The destination node discards the corresponding traffic received over the non-selected ring, since the virtual circuits are not terminated on the interface associated with that ring.
Intermediate nodes manage traffic on a virtual path basis, without reference to virtual circuit routing information. The intermediate nodes thus pass the virtual path traffic received on the primary ring to successive nodes on the primary ring and the virtual path traffic received on the secondary ring to successive nodes on the secondary ring, regardless of which ring is ultimately chosen as the selected route by the destination node.
A decision to perform protection switching on the virtual path ring is made on the basis of the virtual paths, and necessarily affects all of the virtual circuits included in the switched virtual paths. The destination nodes implement the switch by individually tearing down the hundreds or thousands of affected virtual circuits set up on the interface to the previously selected route, and setting up new virtual circuits on the interface to the previously non-selected route. If the SONET ring is not unidirectional, the source nodes must also switch from sending the affected virtual circuit traffic over the previously selected ring to sending the traffic over the previously non-selected ring. The source nodes must thus similarly tear down the hundreds or thousands of virtual circuits from one ring interface and set up new ones on the interface to the other ring.
If the number of affected virtual circuits is relatively large, a given destination node may not satisfy the 50 msec time limit for protection switching. The protection switching time limit may thus limit the number of virtual circuits that can be included in a given virtual path. What is needed is a mechanism that rapidly performs protection switching regardless of the number of affected virtual circuits.
There are essentially two events that trigger protection switching, namely, the failure of a selected path or the degradation of the path. When the path fails, traffic is no longer provided to the destination node over the path. When the path is degraded, the traffic over the path becomes corrupted.
The SONET ring specifications require that intermediate nodes detect a path failure by determining when a node interface on the path is no longer operational. See Bellcore Standard GR-2980. The intermediate nodes then send appropriate OAM cells to the affected destination nodes, to notify them of the path failure. If the ring is not unidirectional, the intermediate nodes also send OAM cells to the affected source nodes. In response to the OAM cells, the affected destination nodes and, as appropriate, the source nodes, perform protection switching.
The failure detection mechanism works well when the path failure is caused by a failure in the transmission medium, such as a broken cable. It does not, however, work well when the path failure is the result of a node no longer forwarding certain cells over the selected ring because of, for example, an erroneous routing table configuration or a hardware problem. The failure to forward the cells must instead be detected by the intended destination node, when it no longer receives cells. Such failures are not readily detected in traffic that is bursty, such as LAN traffic, because the time intervals between the cells vary in length. Accordingly, there may be a relatively long delay, and thus, a loss of cells, before the destination node detects the path failure. What is needed is a mechanism for the nodes to detect path failure without significant delay, even in bursty traffic.
The SONET ring specifications also require that nodes notify “upstream” nodes of degradation in the path, that is, degradation in the transmission facilities on the path. See Bellcore Standard GR-2980. The specifications do not, however, state how the nodes determine that a path has become sufficiently degraded, that is, how the nodes determine an accumulated error rate for the path.
Errors in the cell body are detected when the cell is processed at the destination node in an ATM segmentation and reassembly layer, which is removed from the SONET transport layer at which the protection switching decisions are made. There is thus a delay in providing error information from the ATM segmentation layer to the SONET transport layer. Also, cells with damaged headers are discarded along the route and the destination nodes are not typically notified of this type of error.
A system may use OAM cells for monitoring virtual path performance, as set forth in ITU-T I.610 (3/93), B-ISDN Operations and Principles. The performance monitoring OAM cells are injected at periodic intervals into the cell streams on the virtual paths. Each performance monitoring OAM cell includes a count of the cells transmitted after the transmission of the previous associated performance monitoring OAM cell, and summary cell parity information for the counted cells. The count and parity information in the performance monitoring OAM cells are used to determine the number of cells lost over the primary and secondary paths and also the associated bit error rates. If the cell losses over the primary path far exceed the cell losses over the secondary path, or there are substantially higher bit error rates associated with the primary path than with the secondary path, a protection switch may be initiated.
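The comparison that a receiving node makes for one monitored block of cells can be illustrated with a minimal sketch. The sketch assumes that the summary parity is a 16-bit bit-interleaved parity over the cell payloads; the text above does not state which parity code is used, and the function names `bip16` and `compare_block` are hypothetical.

```python
from typing import List, Tuple


def bip16(payloads: List[bytes]) -> int:
    """Assumed summary parity: XOR of every 16-bit word of every payload."""
    parity = 0
    for payload in payloads:
        for i in range(0, len(payload) - 1, 2):
            parity ^= (payload[i] << 8) | payload[i + 1]
    return parity


def compare_block(sent_count: int, sent_parity: int,
                  received: List[bytes]) -> Tuple[int, int]:
    """Compare the count and parity carried in a performance monitoring
    OAM cell with the cells actually received since the previous one.

    Returns (cells_lost, parity_violations). The parity figure is only
    meaningful when no cells were lost from the block.
    """
    cells_lost = sent_count - len(received)
    parity_violations = bin(sent_parity ^ bip16(received)).count("1")
    return cells_lost, parity_violations
```

A node would run this comparison separately for the primary and the secondary path and compare the two results. The sketch also makes the cost visible: every user cell on the path has to be counted and folded into the parity at line rate.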
In order to use the performance monitoring OAM cells, the nodes must perform real time processing of the aggregate cell stream, at line rates. This adds complexity and cost to the system. Further, the performance monitoring OAM cells do not necessarily provide information from which a node can detect path failure with bursty traffic on a timely basis, since the performance monitoring OAM cells are not generated for a given virtual path when there is no cell traffic on that path.
In the known prior systems, the error information from which the nodes determine path degradation may thus be both delayed and incomplete. Accordingly, the nodes may be delayed in making a determination that a path is degraded and, in turn, in making a decision to perform protection switching. The delayed protection switching decision may result in a loss of cells. What is needed is a mechanism for providing to the nodes, in a timely manner, error information from which the nodes can detect path degradation without significant delay.
A system employing the inventive rapid ring protection switching sets up corresponding virtual circuits over both the selected ring and the non-selected ring. A destination node maintains a primary set of routing tables that contains the routing information for every virtual circuit over the primary ring. The node also maintains a secondary set of routing tables that contains the routing information for every virtual circuit over the secondary ring. The node then essentially disables the appropriate entries in the routing tables for the non-selected route, such that traffic received over that route is ultimately discarded. The corresponding entries in the routing table for the selected route remain enabled, and the node uses the information contained therein to route the traffic off of the ring.
More specifically, each set of routing tables includes a virtual path index (VPI) table and an associated virtual circuit index (VCI) table. Each entry in the primary and secondary VCI tables includes a VCI key, with the same VCI key assigned to all of the virtual circuits included in a given virtual path. Each entry in the primary and secondary VPI tables includes a VPI key that is set to match the associated VCI key or not match the VCI key depending on which ring is the selected route for a given virtual path. If the primary ring is the selected route for the given virtual path, the VPI key in the primary VPI table entry for the virtual path is set to the same value as the VCI key. The VPI key in the corresponding entry in the secondary VPI table is set to a value that does not match the corresponding VCI key, to disable the entry.
When a destination node receives a cell over, for example, the primary ring, the node enters the primary set of tables and retrieves the routing information contained in the locations addressed using VPI and VCI values included in the cell. Before the node uses the routing information, the node determines if the retrieved VPI and VCI keys match. If the keys match, the node uses the routing information. Otherwise, the node discards the cell. Similarly, when the node receives a cell over the secondary ring, the node enters the secondary set of tables and retrieves the routing information contained therein. The node then determines if the retrieved VPI and VCI keys match, and if so uses the routing information.
To switch a virtual path, and thus all of the virtual circuits included therein, from the selected route to the non-selected route, the node alters the VPI keys in the appropriate entries in the primary and the secondary VPI tables. In the example, the node alters the VPI key in the secondary VPI table entry to match the associated VCI key, and alters the VPI key in the corresponding entry in the primary VPI table to no longer match the VCI key. The node then uses the routing information for the previously non-selected route to direct the affected traffic off of the ring. The switching for all of the affected virtual circuits is thus simultaneously accomplished by altering the appropriate VPI keys. This is in contrast to the re-configuring of the hundreds or thousands of virtual circuits, as is required in known prior systems.
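The key-matching lookup and the switch operation described above can be illustrated with a minimal sketch. The table layout, the names `RingTables`, `lookup` and `select_ring`, and the use of a reserved `DISABLED_KEY` value are assumptions made for illustration only; they are not taken from the described system.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

DISABLED_KEY = 0  # hypothetical value that is never assigned as a VCI key


@dataclass
class VciEntry:
    vci_key: int   # shared by every virtual circuit in one virtual path
    out_port: str  # routing information used to take the cell off the ring


class RingTables:
    """One set of routing tables (VPI table and VCI table) for one ring."""

    def __init__(self) -> None:
        self.vpi_keys: Dict[int, int] = {}              # VPI -> VPI key
        self.vci: Dict[Tuple[int, int], VciEntry] = {}  # (VPI, VCI) -> entry

    def add_circuit(self, vpi: int, vci: int, vci_key: int, out_port: str) -> None:
        self.vci[(vpi, vci)] = VciEntry(vci_key, out_port)
        self.vpi_keys.setdefault(vpi, DISABLED_KEY)  # disabled until selected

    def lookup(self, vpi: int, vci: int) -> Optional[str]:
        """Return the output port, or None if the cell is to be discarded."""
        entry = self.vci.get((vpi, vci))
        if entry is None or self.vpi_keys.get(vpi) != entry.vci_key:
            return None
        return entry.out_port


def select_ring(selected: RingTables, other: RingTables, vpi: int, vci_key: int) -> None:
    """Switch one virtual path by rewriting one VPI key in each table set."""
    selected.vpi_keys[vpi] = vci_key
    other.vpi_keys[vpi] = DISABLED_KEY
```

In this sketch, `select_ring` performs two table writes per virtual path no matter how many virtual circuits the path carries, which is the property the key comparison is intended to provide. The VCI entries on both rings are never touched during a switch.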
To detect an event that triggers protection switching, such as path failure, the system 10 uses “Continuity OAM cells” to provide path status information to the nodes. A source node periodically originates Continuity OAM cells on the primary and the secondary routes to a given destination node. A node that receives a Continuity OAM cell multicasts the cell to (a) the processor on the node and (b) successive nodes on the same ring, without alteration. Accordingly, the Continuity OAM cells travel around each of the rings at predetermined intervals.
If the node processor determines that the node has not received a predetermined number of Continuity OAM cells from a given source node over the selected route within a predetermined time window, the node determines that there is a path failure on the selected route. The affected destination node then triggers protection switching if, in the same window, the node receives an appropriate number of Continuity OAM cells from the source node over the non-selected route. The delay in deciding to trigger protection switching is thus the length of the time window, regardless of the bursty nature of the traffic on the route.
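The window test can be illustrated with a minimal sketch. Arrival times are expressed in integer milliseconds, and the function names and threshold parameters are hypothetical; the actual window length and cell counts are not specified above.

```python
from typing import Iterable


def count_in_window(arrivals_ms: Iterable[int], start_ms: int, length_ms: int) -> int:
    """Count Continuity OAM cells that arrived in [start, start + length)."""
    end_ms = start_ms + length_ms
    return sum(1 for t in arrivals_ms if start_ms <= t < end_ms)


def continuity_switch_needed(selected_ms: Iterable[int],
                             non_selected_ms: Iterable[int],
                             start_ms: int,
                             length_ms: int,
                             min_selected: int,
                             min_non_selected: int) -> bool:
    """Trigger a switch only if the selected route fell short of its
    expected cell count while the non-selected route delivered enough
    cells in the same window."""
    selected_failed = count_in_window(selected_ms, start_ms, length_ms) < min_selected
    other_alive = count_in_window(non_selected_ms, start_ms, length_ms) >= min_non_selected
    return selected_failed and other_alive
```

Because the test depends only on the Continuity OAM cells, the decision time in this sketch is bounded by `length_ms` whether or not user traffic is flowing. Requiring the non-selected route to be alive avoids switching onto a route that has failed as well.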
To facilitate detection of path degradation, each source node may include an error count for the associated ring interface in the Continuity OAM cells. The source node thus includes in the Continuity OAM cells that it sends, for example, over the primary ring, an error count associated with its incoming interface on the primary ring. Similarly, the source node includes in the Continuity OAM cells that it sends over the secondary ring an error count associated with the incoming interface on the secondary ring. Each successive node keeps track of the error counts from each of the other nodes and, based on ring topology information, determines accumulated error rates for the routes over the primary and the secondary rings. A destination node detects path degradation when the accumulated error rate over the selected route from the source node exceeds a predetermined maximum rate for a selected time interval. The destination node triggers protection switching if during the same time interval the accumulated error rate over the non-selected route is better than that over the selected route by a predetermined margin.
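The degradation test can be illustrated with a minimal sketch. It assumes that a route's accumulated error rate is the sum of the error counts reported for the incoming interfaces of the nodes the route crosses, divided by the measurement interval; the text above does not fix the accumulation rule, and all names and threshold values here are hypothetical.

```python
from typing import Dict, List


def hops_on_route(ring_order: List[str], source: str, dest: str) -> List[str]:
    """Nodes whose incoming interface lies on the route from source to
    dest, following the ring's direction of travel."""
    if dest not in ring_order:
        raise ValueError("destination is not on the ring")
    i = ring_order.index(source)  # raises ValueError if source is absent
    hops: List[str] = []
    while True:
        i = (i + 1) % len(ring_order)
        hops.append(ring_order[i])
        if ring_order[i] == dest:
            return hops


def accumulated_rate(error_counts: Dict[str, int], ring_order: List[str],
                     source: str, dest: str, interval_s: float) -> float:
    """Assumed rule: sum the per-interface counts along the route."""
    hops = hops_on_route(ring_order, source, dest)
    return sum(error_counts.get(node, 0) for node in hops) / interval_s


def degradation_switch_needed(selected_rate: float, other_rate: float,
                              max_rate: float, margin: float) -> bool:
    """Switch if the selected route exceeds the maximum rate and the
    non-selected route is better by at least the margin."""
    return selected_rate > max_rate and (selected_rate - other_rate) >= margin
```

For a four-node ring in which the secondary ring carries traffic in the opposite direction, the same source and destination pair crosses different interfaces on each ring, which is why the topology information is needed before the reported counts can be summed.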
The Continuity OAM cells used in the current system differ from the periodically transmitted OAM cells used in conventional ATM systems to ensure that a connection does not time-out and disconnect because of a lack of traffic. For example, the conventional OAM cells, which are discussed in ITU-T I.610 (3/93), B-ISDN Operations and Principles, do not provide status information about selected and non-selected routes. Further, these OAM cells are not transmitted when there is traffic over the connection.
The current rapid ring protection system, with its mechanism for detecting path failure and/or path degradation, allows the destination nodes to trigger protection switching without the delay incurred in known prior systems. The current switching mechanism then allows the nodes to switch selected and non-selected routes simultaneously for any number of virtual circuits, and ensures that protection switching time limits are met.