The present invention relates to data telecommunication networks. More particularly, the present invention relates to a network survivability scheme for use in a data telecommunication network.
Traditionally, data telecommunication networks have been designed to carry traffic with "best effort" characteristics. In a best-effort system, in the event of a failure the system will attempt to reroute data signals, but will discard the data if the attempt at rerouting is not successful. The explosive growth of the Internet and the increasing importance of the information exchanged over it lead to the need for highly reliable data networks. To reliably manage larger quantities of information, superior survivability schemes for managing the flow of data need to be implemented. A survivability scheme provides a network with a procedure for rerouting data being conveyed over the network in the event of a failure in the network.
Routers are devices for managing the flow of data over a network. Currently, routers are responsible for communicating with other routers and choosing between multiple paths when sending data over the network to routers located in other parts of the network. In choosing between multiple paths, a router will select the most efficient path (based on some measurement, e.g., distance or cost) between two locations (each referred to hereafter as a network node, or simply a node) and will automatically reroute data in the event of system failures. Generally, a data network consists of multiple nodes and a transport network. An individual node represents the router hardware and software for directing the data, and a transport network represents the physical paths available to transmit data between nodes. Presently, if a failure occurs in one node, an indication of the failure is communicated to other network nodes so that the routers in these nodes become aware that a failure has occurred and can reroute the affected data appropriately.
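By way of illustration only, the path selection and rerouting behavior described above can be sketched as a least-cost search that skips nodes known to have failed. The function name `cheapest_path`, the node labels N1 to N4, and the link costs are hypothetical and do not appear in the figures; Dijkstra's algorithm is used here as one common choice, not as the method of any particular router.

```python
import heapq

def cheapest_path(links, src, dst, failed=frozenset()):
    """Return (cost, path) for the least-cost route, or None if unreachable.

    `links` maps an undirected pair (a, b) to a cost; any link touching a
    node in `failed` is ignored, which models rerouting around a failure.
    """
    graph = {}
    for (a, b), cost in links.items():
        if a in failed or b in failed:
            continue
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, src, [src])]   # (cost so far, current node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, link_cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + link_cost, nxt, path + [nxt]))
    return None

# Hypothetical four-node network with two disjoint routes from N1 to N4.
LINKS = {("N1", "N2"): 1, ("N2", "N4"): 1, ("N1", "N3"): 2, ("N3", "N4"): 2}
```

With no failures the search prefers the route through N2; once N2 is reported as failed, the same call falls back to the costlier route through N3.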
One method for providing network reliability is to implement a dual-homing architecture in the nodes and connect the nodes over a shared protection transport network. An example of this arrangement is depicted in FIG. 1. To facilitate discussion, the edge routers 1-2 in node 1 will be referred to as source edge routers and the edge routers 3-4 in node 2 will be referred to as destination edge routers. Each edge router 1-4 can act as a source or destination edge router depending on the direction the data is flowing. In a dual-homing architecture, the traffic from a source edge router 1 is directed or "homed" to two diverse core routers A and B so that the failure of a single core router can be tolerated. This scheme allows core router B to manage the data from edge router 1 if core router A fails, and vice versa. If the path between edge router 1 and core router B is assigned as the primary path P and core router B fails, a secondary path S between edge router 1 and core router A could be used.
Likewise, a dual-homing architecture is implemented in node 2. In node 2, destination edge router 4 is homed to core routers C and D. If the path between edge router 4 and core router D is designated as the primary path P and the path between edge router 4 and core router C is designated as the secondary path S, a failure in core router D would have to be detected by edge router 4 so that edge router 4 would use the data from the secondary path.
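The dual-homing behavior described for nodes 1 and 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the class `EdgeRouter` and its methods `report_failure` and `next_hop` are hypothetical names, and how a failure is actually detected is outside the scope of the sketch.

```python
from dataclasses import dataclass, field

@dataclass
class EdgeRouter:
    """Edge router homed to two core routers (hypothetical model)."""
    name: str
    primary: str                      # core router on primary path P
    secondary: str                    # core router on secondary path S
    failed: set = field(default_factory=set)

    def report_failure(self, core_router: str) -> None:
        # Record that a core router has been detected as failed.
        self.failed.add(core_router)

    def next_hop(self):
        # Prefer the primary path; fall back to the secondary path.
        if self.primary not in self.failed:
            return self.primary
        if self.secondary not in self.failed:
            return self.secondary
        return None                   # both homes lost: no path available

# Edge router 1 with primary path to core router B and secondary to A,
# matching the example assignment given in the text.
edge1 = EdgeRouter("edge router 1", primary="B", secondary="A")
```

A single core router failure is tolerated because `next_hop` still returns the surviving home; only the loss of both core routers leaves the edge router without a path.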
Given the dual-homing approach between edge and core routers, the role of the transport network can be either to provide two diverse optical pipes (primary and secondary) between each pair of core routers or to enable sharing of protection capacity in order to recover from any transport network failure (e.g., a link failure). FIG. 2 illustrates the network architecture where the transport network is just providing diverse optical pipes. (Note that unshared protection is provided between the pair of source-destination edge routers, implying that the protection-switching function is only required at the edge routers.) This architecture can be realized by providing either (1+1) or (1:1) protection of edge-to-edge primary paths. In a (1+1) architecture, the traffic is simultaneously fed into both primary and secondary paths. This enables the destination edge router to identify a failure by simply monitoring the primary as well as the secondary paths. In a (1:1) architecture, under the no-failure condition, the traffic is only fed into the primary paths, and when a failure occurs the traffic is switched to the secondary path for all the affected primary paths. This enables the system to use the protection capacity (secondary paths) for carrying preemptable traffic under normal conditions, but at the cost of complex signaling mechanisms that will considerably increase the restoration time. A third option is to split the traffic and use two diverse paths in a load-sharing mode. Here, there is really no concept of primary and secondary paths. Each path becomes a back-up for the other. Each path is provisioned with enough capacity to handle traffic for both. Like the (1:1) architecture, this approach also requires signaling to move the traffic from one path to the other.
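The difference between (1+1) and (1:1) protection can be made concrete with a small sketch. The function names, the returned fields, and in particular the count of one signaling message per switch-over are illustrative assumptions; real signaling mechanisms are more involved.

```python
def protect_1plus1(primary_up: bool, secondary_up: bool) -> dict:
    """(1+1): the source feeds both paths; the destination monitors both
    and simply reads from a live one, so no signaling is needed."""
    if primary_up:
        path = "primary"
    elif secondary_up:
        path = "secondary"
    else:
        path = None
    # The secondary path always carries a copy, so it cannot carry
    # preemptable traffic.
    return {"path": path, "signaling": 0, "preemptable_ok": False}

def protect_1to1(primary_up: bool, secondary_up: bool) -> dict:
    """(1:1): the source feeds only the primary path; on failure the
    traffic must be switched, which requires signaling (assumed here to
    be a single message, purely for illustration)."""
    if primary_up:
        # Secondary is idle and may carry preemptable traffic.
        return {"path": "primary", "signaling": 0, "preemptable_ok": True}
    path = "secondary" if secondary_up else None
    return {"path": path, "signaling": 1, "preemptable_ok": False}
```

The sketch shows the trade-off stated above: (1:1) frees the secondary path for preemptable traffic under normal conditions, while (1+1) recovers without any signaling.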
Since (1+1) and (1:1) architectures use unshared protection, the amount of protection capacity must be large enough to carry the total network traffic (along secondary paths that are disjoint from the primary paths). If the total network traffic is T, then, from some real network design exercises, the additional capacity required for unshared protection is known to be about 2T. This required protection capacity can be reduced if the transport network is able to provide shared protection for any failure in the transport network domain. FIG. 3 illustrates such an architecture. In FIG. 3, p(1), p(2), and p(3) (depicted by solid lines) represent optical pipes carrying primary traffic between nodes consisting of core routers (A and B) and (C and D), (E and F) and (C and D), and (G and H) and (C and D). The optical pipes reserved for carrying protection traffic (in case of a failure) are depicted by dotted lines. Note that the optical pipe p(4) can be shared for any failure affecting optical pipes p(1), p(2), and p(3), thus requiring less capacity than the unshared protection case. It is well known in the art that, for some real networks, shared protection can save protection capacity on the order of 20% to 40% compared to unshared protection. Given the significant savings in protection capacity, the high cost of long-haul optics, and the availability of shared protection capability in today's transport networks, using shared protection in the transport network is an attractive option. However, it is still necessary to consider the recovery from a router failure in a node when shared protection is used in the transport network.
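The capacity figures quoted above can be worked through numerically. The sketch below takes the approximately 2T unshared figure and the 20% to 40% savings range from the text; the function names are hypothetical, and the sizing rule for a shared pipe such as p(4) assumes that only one primary pipe fails at a time, which is an assumption of this sketch rather than a statement from the text.

```python
def shared_protection_capacity(total_traffic: float, savings_percent: int) -> float:
    """Protection capacity under shared protection, starting from the
    approximately 2T needed for unshared protection."""
    unshared = 2 * total_traffic
    return unshared * (100 - savings_percent) / 100

def protection_pipe_capacity(primary_demands, shared: bool):
    """Capacity needed to protect several primary pipes.

    Unshared: each primary pipe gets its own protection, so demands add.
    Shared:   one pipe protects all of them; under the single-failure
              assumption it only needs to carry the largest demand.
    """
    return max(primary_demands) if shared else sum(primary_demands)
```

For T = 100 units, the 20% to 40% savings range corresponds to roughly 160 down to 120 units of protection capacity, against about 200 units for unshared protection.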
As in traditional multi-layer survivability schemes, transport network failures can be recovered by the shared protection transport network, and router failures can be recovered by using Internet Protocol-Multi Protocol Label Switching (IP-MPLS). However, such nested multi-layer survivability schemes have major drawbacks. For example, they require that transport networks be provisioned for both the primary and protection paths of the nodes. This means that in addition to the transport network failure protection (provided by the transport network), the architecture is providing (1:1) protection for failures in the nodes. Depending on the availability of the network and the type of services, the nodes may be required to protect only a fraction of the traffic, which in an extreme case is 100%. Such an architecture simplifies operation and management of the network by recovering node and transport network failures locally, but results in substantially more capacity costs in the transport layer.
The present invention discloses a network architecture for combining a dual-homing approach in the nodes with a shared protection transport network. The network architecture removes the need for signaling between nodes over the transport network in the event of a router failure in one of the nodes. The network architecture incorporates select functions into the nodes for managing the data flow between individual edge routers and their connection through core routers to the transport network.
The select function determines whether failures have occurred in core routers within the node containing the select function and provides switching so that a single accurate data path is sent to the transport network. The select function eliminates the need for sending multiple data paths (e.g., a primary data path and a secondary data path to provide backup for the primary data path) and signaling (e.g., to indicate which path should be used) from the node over the transport network to other nodes. The select function allows the individual nodes to recover independently without signaling other nodes over the transport network, thereby eliminating the need for complex signaling mechanisms.
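The behavior attributed to the select function can be sketched as follows. The text describes the select function only at a functional level, so the signature, the use of `None` to represent an absent signal, and the preference for the primary core router are all assumptions made for illustration.

```python
from typing import Optional

def select(primary: Optional[bytes], secondary: Optional[bytes],
           primary_core_up: bool, secondary_core_up: bool) -> Optional[bytes]:
    """Forward exactly one data stream to the transport network.

    The decision uses only local knowledge of the core routers in the
    same node, so no signaling to other nodes is required.
    """
    if primary_core_up and primary is not None:
        return primary        # normal case: primary core router is healthy
    if secondary_core_up and secondary is not None:
        return secondary      # local recovery: switch to the other core router
    return None               # neither core router can supply the data
```

Because the switch happens inside the node, the transport network sees a single data path regardless of which core router supplied it.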