Computer software developed for commercial release is typically subjected to multiple rounds of testing during its development. Testing helps the developer verify that the software has the desired functionality and identify errors or incompatibilities with various hardware configurations or other software. The software can then be redesigned, and any errors or incompatibilities corrected, while the software is still in development and not yet released for commercial sale.
Robust operating systems, such as Microsoft Corporation's Windows NT operating system, must be capable of tolerating intermittent failures of the underlying network communications connecting two or more computers in a network. Such network failures are common in the complex network topologies typical of today's corporate environments. Accordingly, from a testing perspective, it is important to verify at the development stage that the operating system correctly tolerates intermittent network failures.
A brief network failure can be simulated by manually unplugging the network cable from a computer for some duration, then manually plugging the cable back into the computer. This test methodology, however, is unsuitable for large-scale automated testing.
The present invention is a software test tool and method for simulating a network failure under software control. In accordance with the invention, the software test tool and method simulate a network failure by temporarily redirecting calls intended for the actual send and receive handlers in a network operating system to "substitute" handler functions. These substitute handler functions intercept data being sent to or received from the network. In one embodiment of the invention, the substitute handler returns a status datum to its caller indicating successful sending or receiving of the data, but does not actually send or receive the data. This effectively cuts off the sending and receiving of data at the substitute handlers. Since no data can be sent or received, the computer is effectively disconnected from the network, as if the network connection were unplugged. The test tool and method resume network operation by again directing calls to the actual send and receive handler functions.
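The redirection described above can be illustrated with a minimal user-space sketch. It is not the implementation of the invention, which operates inside a network operating system; the names Transport, FakeWire, and STATUS_SUCCESS are hypothetical and exist only for this illustration. The sketch shows calls being routed through a handler table, so that swapping the table entries silently drops data while still reporting success.

```python
from typing import Callable, Dict, List

STATUS_SUCCESS = 0  # hypothetical status code, assumed for illustration


class FakeWire:
    """Stands in for the physical network; records what was really sent."""

    def __init__(self) -> None:
        self.sent: List[bytes] = []


class Transport:
    """Routes send/receive calls through a replaceable handler table."""

    def __init__(self, wire: FakeWire) -> None:
        self.wire = wire
        self.delivered: List[bytes] = []  # data passed up to the caller
        self._actual: Dict[str, Callable[[bytes], int]] = {
            "send": self._actual_send,
            "receive": self._actual_receive,
        }
        self.handlers = dict(self._actual)

    def _actual_send(self, data: bytes) -> int:
        self.wire.sent.append(data)
        return STATUS_SUCCESS

    def _actual_receive(self, data: bytes) -> int:
        self.delivered.append(data)
        return STATUS_SUCCESS

    @staticmethod
    def _substitute(data: bytes) -> int:
        # Report success but discard the data: nothing reaches the wire
        # and nothing is delivered to the caller.
        return STATUS_SUCCESS

    def suspend(self) -> None:
        """Redirect both calls to the substitute handler."""
        self.handlers = {"send": self._substitute, "receive": self._substitute}

    def resume(self) -> None:
        """Restore the actual handlers."""
        self.handlers = dict(self._actual)

    def send(self, data: bytes) -> int:
        return self.handlers["send"](data)

    def receive(self, data: bytes) -> int:
        return self.handlers["receive"](data)
```

In this sketch a caller cannot distinguish a suspended transport from a working one by the returned status alone, which mirrors the embodiment in which the substitute handler indicates success without transferring data.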
In alternative embodiments, the substitute handlers can return a status datum to indicate other conditions, such as a network failure. Further, the substitute handlers in some alternative embodiments can intermittently return a status datum indicative of successful sending or receiving, and another status datum indicative of network failure.
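As a hedged illustration of these alternative embodiments, the following sketch builds a substitute handler whose returned status is chosen per call. The function make_substitute_handler, the failure_probability parameter, and both status codes are assumptions made for this example and are not taken from the invention. A probability of 1.0 always reports failure, 0.0 always reports success, and values in between give intermittent behavior.

```python
import random
from typing import Callable, Optional

STATUS_SUCCESS = 0          # hypothetical status code
STATUS_NETWORK_FAILURE = 1  # hypothetical status code


def make_substitute_handler(
    failure_probability: float,
    rng: Optional[random.Random] = None,
) -> Callable[[bytes], int]:
    """Build a substitute handler that discards data and picks a status.

    failure_probability is the chance that a given call reports a network
    failure instead of success. Passing a seeded rng makes the sequence of
    statuses reproducible from one test run to the next.
    """
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    if rng is None:
        rng = random.Random()

    def handler(data: bytes) -> int:
        # The data is never transferred, whichever status is returned.
        if rng.random() < failure_probability:
            return STATUS_NETWORK_FAILURE
        return STATUS_SUCCESS

    return handler
```

Seeding the generator is a design choice for testing: a failure sequence that exposed a defect can be replayed exactly.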
In accordance with a further aspect of the invention, commands are provided for controlling the test tool and method from software. A "suspend network" command causes the test tool and method to simulate network failure. A "resume network" command causes the test tool and method to restore network operation. In some embodiments of the invention, the commands include options for specifying a particular network adapter card, a particular network transport protocol, or both, which is to be suspended or resumed. Accordingly, through use of these commands, a software procedure can exercise deterministic control over simulated network failures. Further, network failure simulations can be automated for large-scale or repetitive testing.
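The command interface described above might be modeled as follows. This is a sketch under stated assumptions: the class NetworkFailureController, its method names, and the representation of a binding as an (adapter, protocol) pair are all hypothetical. It only tracks which bindings are suspended and does not itself intercept any traffic. Omitting an option is taken here to mean "all adapters" or "all protocols".

```python
from typing import Iterable, Optional, Set, Tuple

Binding = Tuple[str, str]  # (adapter name, transport protocol name)


class NetworkFailureController:
    """Tracks which adapter/protocol bindings are currently suspended."""

    def __init__(self, bindings: Iterable[Binding]) -> None:
        self._bindings: Set[Binding] = set(bindings)
        self._suspended: Set[Binding] = set()

    def _select(
        self, adapter: Optional[str], protocol: Optional[str]
    ) -> Set[Binding]:
        # None acts as a wildcard for that option.
        return {
            (a, p)
            for (a, p) in self._bindings
            if (adapter is None or a == adapter)
            and (protocol is None or p == protocol)
        }

    def suspend_network(
        self, adapter: Optional[str] = None, protocol: Optional[str] = None
    ) -> Set[Binding]:
        """Mark the selected bindings as suspended and return them."""
        selected = self._select(adapter, protocol)
        self._suspended |= selected
        return selected

    def resume_network(
        self, adapter: Optional[str] = None, protocol: Optional[str] = None
    ) -> Set[Binding]:
        """Mark the selected bindings as resumed and return them."""
        selected = self._select(adapter, protocol)
        self._suspended -= selected
        return selected

    def is_suspended(self, adapter: str, protocol: str) -> bool:
        return (adapter, protocol) in self._suspended
```

A test script could, for example, call suspend_network(adapter="adapter0") to cut off one card on every protocol bound to it, then resume_network() with no options to restore everything.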
Software-controlled simulated network failures according to the invention are beneficial in a number of testing applications, as shown by the following examples. For network domain log-in tests, the primary log-in server computer can be temporarily "suspended" so that the client log-in request is handled by a backup log-in server rather than the primary server. For file transfer tests, one of the two computers can have its network operation briefly "suspended," for a duration long enough to cause some network packets to be retransmitted, and then "resumed." For system stress testing, randomized brief network failures cause retry-disconnect-reconnect logic to execute in various system components, exposing logic paths that might otherwise not be executed except by physical failure of the underlying network. The tool can be applied similarly wherever a test requires a simulated network failure.
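For the stress-testing case, one way to produce randomized brief failures is to precompute a schedule of outage windows. The sketch below is an assumption about how a test driver might do this, not part of the invention: random_failure_schedule and its parameters are hypothetical names. It divides the test run into equal slots and places one outage inside each slot, so the windows cannot overlap.

```python
import random
from typing import List, Tuple


def random_failure_schedule(
    total_seconds: float,
    failures: int,
    max_outage_seconds: float,
    seed: int,
) -> List[Tuple[float, float]]:
    """Return (start, duration) outage windows in increasing start order.

    The run is split into `failures` equal slots and one outage is placed
    at a random position inside each slot. The seed makes the schedule
    reproducible.
    """
    if failures <= 0 or total_seconds <= 0 or max_outage_seconds <= 0:
        raise ValueError("all arguments except seed must be positive")
    rng = random.Random(seed)
    slot = total_seconds / failures
    schedule: List[Tuple[float, float]] = []
    for i in range(failures):
        # The outage is capped so that it always fits inside its slot.
        duration = rng.uniform(0.0, min(max_outage_seconds, slot))
        start = i * slot + rng.uniform(0.0, slot - duration)
        schedule.append((start, duration))
    return schedule
```

A driver would issue a "suspend network" command at each start time and a "resume network" command once the corresponding duration has elapsed.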