The reliability of computer-based applications continues to be an important consideration. Moreover, the distribution of applications across multiple computers, connected by a network, only complicates overall system reliability issues. One critical concern is the reliability of the network connecting the multiple computers. Accordingly, fault-tolerant networks have emerged as a solution to ensure computer connection reliability.
In many applications, the connection between a single computer and a network is a critical point of failure. That is, often a computer is connected to a network by a single physical connection. Thus, if that connection were to break, all connectivity to and from the particular computer would be lost. Multiple connections from a single computer to a network have therefore been implemented, but not without problems.
Turning to FIG. 1, a diagram of a computer 11 connected to a network 21 is shown. Computer 11 includes a network interface, for example, a fast-Ethernet interface 13. A connection 30 links fast-Ethernet interface 13 with a fault-tolerant transceiver 15. Fault-tolerant transceiver 15 establishes a connection between connection 30 and one of two connections 29 and 31 to respective fast-Ethernet switches 19 and 17 (these "switches" as used herein are SNMP managed network switches). Switches 17 and 19 are connected in a fault-tolerant manner to network 21 through connections 23 and 25.
Fault-tolerant transceiver 15 may be purchased from a number of vendors and may include, for example, a Digi brand, model MIL-240TX redundant port selector. Fast-Ethernet switches 17 and 19 may likewise be purchased from a number of vendors and may include, for example, a Cisco brand, model 5000 series fast-Ethernet switch.
Operationally, traffic normally passes from fast-Ethernet interface 13 through fault-tolerant transceiver 15, and over a primary connection 29 or 31 to respective switch 19 or 17 and on to network 21. The other of connections 29 and 31 remains inactive. Network 21 and switches 17 and 19 maintain routing information that directs traffic bound for computer 11 through the above-described primary route.
In the event of a network connection failure, fault-tolerant transceiver 15 will switch traffic to the other of connections 29 and 31. For example, if the primary connection was 31, and connection 31 broke, fault-tolerant transceiver 15 would switch traffic to connection 29.
When, for example, traffic from computer 11 begins passing over its new, backup connection 29 through switch 19, network routing has to be reconstructed such that traffic bound for computer 11 is routed by the network to the port on switch 19 to which connection 29 is attached. Previously, the routing directed this traffic to the port on switch 17 to which connection 31 was attached.
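By way of illustration only, the failover sequence described above may be sketched as a small model. The class name, method names, and the dictionary used to represent network routing are hypothetical and are not part of any vendor's product; the sketch merely shows that the transceiver changes connections immediately while the network-side routing remains stale until it is rebuilt.

```python
# Illustrative model only; all names here are hypothetical.
class RedundantPortSelector:
    """Toy model of fault-tolerant transceiver 15 of FIG. 1."""

    def __init__(self, primary, backup):
        self.active = primary
        self.standby = backup

    def lose_rx_signal(self, connection):
        # The transceiver reacts only to loss of receive signal
        # on the connection that is currently active.
        if connection == self.active:
            self.active, self.standby = self.standby, self.active


# Network-side view: connection 31 attaches to switch 17,
# connection 29 attaches to switch 19 (per FIG. 1).
switch_for_connection = {31: 17, 29: 19}

# Routing initially directs traffic for computer 11 toward switch 17.
route_to_computer_11 = {"switch": 17}

selector = RedundantPortSelector(primary=31, backup=29)
selector.lose_rx_signal(31)  # connection 31 breaks

# The transceiver has already moved to connection 29 ...
print(selector.active)  # 29

# ... but routing still points at switch 17 until it is rebuilt.
stale = route_to_computer_11["switch"] != switch_for_connection[selector.active]
print(stale)  # True
```

The interval during which `stale` remains true corresponds to the routing reconstruction period discussed herein.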
Several problems arise from the above-described operation. First, the rebuilding of network routing to accommodate passing traffic over the backup connection may take an extended period of time. This time may range from seconds to minutes, depending upon factors including network equipment design and where the fault occurs. Second, fault-tolerant transceiver 15 is only sensitive to a loss of the physical receive signal on the wire pair from the switches (e.g., 17 and 19) to the transceiver. It is not sensitive to a break in the separate wire pair from the transceiver to the switch. Also, it is sensitive only to the signal from the switch to which it is directly attached, and it does not test the backup link for latent failures that would prevent a successful recovery. This technique also fails to test the switches themselves.
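The detection limitations enumerated above may be summarized, purely as an illustrative sketch, by classifying failure modes according to whether a monitor that observes only the receive signal on the active connection would notice them. The enumeration names and the function name are hypothetical labels chosen for this example.

```python
# Illustrative classification only; the fault names are hypothetical labels.
from enum import Enum, auto


class Fault(Enum):
    RX_PAIR_ACTIVE_LINK = auto()       # switch-to-transceiver pair, active link
    TX_PAIR_ACTIVE_LINK = auto()       # transceiver-to-switch pair, active link
    BACKUP_LINK_LATENT = auto()        # any fault on the inactive backup link
    SWITCH_FAULT_SIGNAL_PRESENT = auto()  # switch fails but link signal remains


def detected_by_rx_signal_monitor(fault):
    """Model of a monitor that sees only receive-signal loss on the active link."""
    return fault is Fault.RX_PAIR_ACTIVE_LINK


undetected = [f.name for f in Fault if not detected_by_rx_signal_monitor(f)]
print(undetected)
```

Under these assumptions, three of the four listed failure modes would go unnoticed by the transceiver.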
Another example of a previous technique for connecting a computer 11 to a network 21 is shown in FIG. 2. Network switches 17 and 19 and their connections to each other and to network 21 are similar to those shown in FIG. 1. However, in this configuration, each of the switches (e.g., 17 and 19) connects to its own fast-Ethernet interface (e.g., 13 and 14) within computer 11.
Operationally, only one of interfaces 13 and 14 is maintained active at any time. When the physical signal to the active interface is lost, use of the interface with the failed connection ceases, and connectivity begins through the other, backup interface. The backup interface assumes the addressing of the primary interface and begins communications. Unfortunately, this technique shares the deficiencies of that depicted in FIG. 1. Rerouting can take an extended period of time, and the only failure mode that may be detected is that of a hard, physical connection failure from the switch to the transceiver.
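The active/backup arrangement of FIG. 2 may likewise be sketched, for illustration only, as follows. The data structure, the function name, and the address value are assumptions made for this example; in particular, the address shown is a placeholder, and the kind of addressing that is assumed by the backup interface is not specified herein.

```python
# Illustrative model only; names and the address value are hypothetical.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interface:
    number: int
    address: Optional[str] = None
    has_signal: bool = True
    active: bool = False


def fail_over_if_needed(active, backup):
    """Return the interface that should carry traffic.

    Failover is triggered only by loss of physical signal on the
    active interface, mirroring the limitation described above.
    """
    if active.has_signal:
        return active
    # The backup interface assumes the addressing of the primary.
    backup.address, active.address = active.address, None
    active.active, backup.active = False, True
    return backup


if13 = Interface(13, address="00:00:5e:00:53:01", active=True)
if14 = Interface(14)

if13.has_signal = False  # physical signal lost on interface 13
current = fail_over_if_needed(if13, if14)
print(current.number, current.address)  # 14 00:00:5e:00:53:01
```

As in the sketch for FIG. 1, nothing in this model exercises the backup interface before it is needed, so a latent fault on the backup path would surface only at failover.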
The present invention is directed toward solutions to the above-identified problems.