Data communications networks are resources that interconnect, and provide for communications between, computers, telephones, and other network users located at geographically separate locations. These networks range from small networks interconnecting computers within a building to large distributed networks spanning countries or continents. Data communications networks are typically implemented where a large number of users interact with or access common files or databases, as such networks provide the necessary connectivity while sharing communications resources in a cost-effective manner.
Data communications networks comprise a group of switches, which are referred to as nodes; network users, who reside at or are electronically connected to the nodes; and a plurality of transmission links which interconnect the nodes. The nodes provide users with access to the network and provide means for routing data transmitted across the network. The transmission links which interconnect these nodes carry the communications signals, and may comprise one or more different types of communications media such as wire, cable, radio, satellite or fiber optic communications links.
In recent years, data communications networks have grown in both size and number. The recent expansion of the Internet is just one example of the trend toward distributed computing and information sharing. Most of these networks provide for communications between users by transmitting data over one or more network "communication" or "routing" paths, each of which is the sequential series of nodes and transmission links that the data traverses in traveling from a source user to a destination user. Each transmission link on such a network communications path is typically referred to as a link or "hop," and the communication path often includes multiple links or hops because most networks provide only sparse connectivity (i.e., most nodes are connected to only a small percentage of the nodes in the network). Thus, a communication may be originated by a first user and pass through several links before reaching the recipient user.
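By way of illustration only, the notion of a multi-hop communication path in a sparsely connected network may be sketched as follows. The topology, the node names and the function `find_path` are hypothetical and are not drawn from any particular network; a breadth-first search is used here merely as one simple way of finding a path with the fewest hops.

```python
from collections import deque

# Hypothetical sparse topology: each node lists only the nodes it is
# directly linked to (most nodes are NOT directly connected to each other).
LINKS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def find_path(links, source, destination):
    """Return the node sequence of a fewest-hop path, or None if unreachable."""
    previous = {source: None}  # records how each node was first reached
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            # Walk back from the destination to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in links.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


path = find_path(LINKS, "A", "E")
hops = len(path) - 1  # one hop per transmission link traversed
print(path, hops)
```

In this assumed topology, node A has no direct link to node E, so a communication between users at those nodes necessarily traverses intermediate nodes and several hops.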
Control over these communications is typically carried out under a networking architecture. Many networking architectures exist for defining and managing communications between network users. A prominent example of an existing network communications architecture is the Systems Network Architecture (SNA), which was initially developed by International Business Machines (IBM) over twenty years ago. SNA comprises a seven-layer communications architecture which allows reliable transmission of data between a host computer and other remote computers or users.
As the number of secondary computers or users proliferated, and as many secondary computers became more powerful, many SNA networks became flooded with information and some host computers became overburdened in attempting to manage the links and flow of data to the numerous secondary users. In light of this problem, a need developed for more advanced architectures that would allow multiple computers or users to assume a primary role in network routing, management and control, so that no single host computer is overburdened by the task. One of the network architectures which was developed to fill this need is known as Advanced Peer-to-Peer Networking (APPN). APPN allows multiple computers, which are defined as "network nodes," to act as peers in communicating and directing information across a network. In addition to these network nodes, APPN networks also may have "end nodes," which are capable of sending and receiving messages, but which do not perform any of the management and routing functions carried out by the network nodes. Moreover, APPN allows an entire SNA network to present an image as a single APPN network node (with multiple connections), thereby allowing SNA networks (which are now typically known as subarea SNA networks) to exist in an APPN network and to have APPN connectivity.
While APPN has proven to be a reliable networking architecture, increased computer networking requirements have created a demand for network architectures which utilize higher performance computer and communication systems. In part because of these demands, High Performance Routing (HPR), which is an enhancement to APPN, was developed. HPR takes advantage of the advances in processing capability, link technology, lower cost memory, error correction coding and other communications and routing enhancements to provide reliable, high speed data routing and delivery which includes both end-to-end error recovery and end-to-end flow and congestion control, where the flow of data is controlled by the sending and receiving systems. HPR also provides for more efficient routing at intermediate nodes.
As data communications networks and their communications architectures have evolved, a demand for more fault tolerant network designs has emerged as users demand a high availability of communications. To meet this demand, many networks provide redundant parts, data checking and correction and other measures that help avoid failures and minimize communications errors introduced by the network. Many networks also now provide backup or secondary link definition, which allows for communications over a second link in the event that the primary link fails for some reason. Examples of networks which provide some form of backup link capability include the networks described in U.S. Pat. Nos. 4,887,290, 5,134,644 and 5,448,723, and U.S. Pat. No. 5,426,773, which describes an SNA network design that includes backup path definition in the event of a failure on a primary communications path.
HPR also provides for non-disruptive path switching. When HPR detects a possibly failed connection (i.e., a node or link along the communications path is no longer operating properly), it may switch to a secondary communications path, if one is available, and then resume the transfer of data on the secondary path. Thus, a path failure need not disrupt the connections that were using the failed path.
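The path switching behavior described above may be illustrated, in greatly simplified form, by the following sketch. It is not the HPR protocol itself: the path lists, the failure sets and the functions `path_is_usable` and `switch_path` are hypothetical names introduced solely for illustration, and the sketch assumes that candidate paths are known in advance and listed in order of preference.

```python
def path_is_usable(path, failed_nodes, failed_links):
    """A path is usable if none of its nodes or links has failed."""
    if any(node in failed_nodes for node in path):
        return False
    # Each consecutive pair of nodes on the path is one link (hop).
    hops = zip(path, path[1:])
    return not any(frozenset(hop) in failed_links for hop in hops)


def switch_path(candidate_paths, failed_nodes=frozenset(), failed_links=frozenset()):
    """Return the first usable path in preference order, or None if none remains."""
    for path in candidate_paths:
        if path_is_usable(path, failed_nodes, failed_links):
            return path
    return None


# Hypothetical primary and secondary paths between the same two end points.
PRIMARY = ["A", "B", "D", "E"]
SECONDARY = ["A", "C", "D", "E"]

# With no failures, the primary path is selected.
before = switch_path([PRIMARY, SECONDARY])

# After the link between B and D fails, the connection moves to the secondary path.
after = switch_path([PRIMARY, SECONDARY], failed_links={frozenset(("B", "D"))})

print(before, after)
```

If a node common to every candidate path fails (node D in this assumed topology), no path remains and the sketch returns `None`, corresponding to the case in which no secondary path is available.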
One advantage of flexible network architectures such as HPR is that, when error conditions occur, the network may be able to reroute data around the failing path to a backup path. Thus, the reliability of the network is maintained. Similarly, in a load-balanced network which shares communications across multiple concurrent paths, when one of the concurrent paths is disrupted, the connections on that path may be rerouted to the other concurrent paths.
One disadvantage of switching to a backup path when a primary path fails is that backup paths are often less economical or of lower performance than the primary path. For example, a primary path may run over high-speed links and the secondary path over lower-speed links. When one of the high-speed links fails, the network connections are rerouted to the backup path, but the throughput of the connection may be reduced considerably, and performance as seen by the user may suffer accordingly. Similarly, the primary path may utilize low-cost leased links, while the backup path often uses high-cost switched links. Thus, when a link on the primary path fails and a connection is switched to a backup path, the cost of the connection may be greatly increased. Moreover, in the load balancing context, when connections are transferred from a failed primary path to a secondary or backup path, there is an increase in traffic on the backup path. This increase in traffic may result in a decrease in performance on the shared path.
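The throughput penalty of a slower backup path may be illustrated with simple arithmetic. The link rates and message size below are assumed values chosen only for illustration, and the hypothetical function `transfer_seconds` makes the simplifying assumption that the end-to-end rate of a path is limited by its slowest link, ignoring propagation delay, protocol overhead and competing traffic.

```python
def transfer_seconds(size_bits, link_rates_bps):
    """Approximate time to move data over a path, limited by its slowest link."""
    return size_bits / min(link_rates_bps)


# Assumed link rates in bits per second (illustrative values only).
primary_links = [1_544_000, 1_544_000]  # both hops on high-speed links
backup_links = [56_000, 1_544_000]      # one hop on a low-speed link

size_bits = 8_000_000  # an assumed one-megabyte transfer

primary_time = transfer_seconds(size_bits, primary_links)
backup_time = transfer_seconds(size_bits, backup_links)
slowdown = backup_time / primary_time

print(f"primary: {primary_time:.1f} s, backup: {backup_time:.1f} s, "
      f"slowdown: {slowdown:.1f}x")
```

Under these assumptions a single low-speed hop on the backup path dominates the transfer time, which is why a connection left on the backup path may perform considerably worse than it did on the primary path.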
In light of the above discussion, there exists a need for improvement in the performance of data communications networks after an error condition causes the failure of a network node or link.