Bugs and other failures to handle normal and exceptional conditions during execution of computer software can result in substantial harm to the software owner or provider, including financial losses, damage to property, and even personal injury, depending on the failure and the type of software.
To find and fix bugs in software and hardware before general release or use by the software developer or publisher, programmers often use one or more computer testing systems. Such computer testing systems can include software running on the system under test, software running on a separate, remote computer system that may be dedicated to managing computer tests, or some combination of these. They are typically used during the development process, before the software is put into production.
Computer systems that are tested by such computer testing systems include not only individual personal computers, but also network servers, such as wide area network servers (for example, Internet Web servers), database servers, and file servers. Information providers, such as search engine providers, often need to scale their operations so that they are able to service high rates of requests without sacrificing reliability. One way of doing this is to incorporate multiple servers into a networked system. Such a collection of servers is sometimes referred to as a server “farm” or “cluster.” Typically, in such a farm or cluster, multiple individual servers operate to render services and responses. As with other computer systems, software and hardware used in such server farms and clusters are typically tested prior to general release or use.
The tests performed on computer systems can include stress testing, long-haul testing, and combinations of these, in addition to other types of testing. Stress testing intentionally puts a system under excessive load, typically by submitting workloads to the system under test at a high rate, while possibly denying the system the resources it needs to process those workloads. The system under test may not ultimately process the workloads, but it is expected to fail gracefully, without corruption or loss of data. By contrast, long-haul testing typically tries to approximate average or typical usage of a system under test, with enough resources to satisfy the workloads, repeated over a long period. While an individual action or operation might take seconds or minutes to complete, a long-haul test is usually designed as a long-running set of operations performed over days or weeks, to verify that the system under test remains operable throughout the test period. Long-haul testing is often able to reveal bugs in a system under test that would not have been apparent from stress testing or other types of computer testing that are done over short periods of time. Such bugs may include resource leak bugs, timing bugs, hardware-related bugs, and counter-overflow bugs.
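As an illustration of why a long-running test can expose a bug that a short test misses, the following is a minimal sketch, not an actual testing system. The `LeakySystem` class, its handle limit, and the `run_long_haul` function are all hypothetical names invented for this example; the simulated system leaks one resource handle at a fixed interval, and the test loop probes its health periodically.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LeakySystem:
    """Hypothetical system under test that leaks one handle every `leak_every` operations."""
    handle_limit: int = 50
    leak_every: int = 100
    open_handles: int = 0
    operations_done: int = 0

    def do_operation(self) -> None:
        self.operations_done += 1
        if self.operations_done % self.leak_every == 0:
            self.open_handles += 1  # simulated leak: the handle is never released

    def is_healthy(self) -> bool:
        return self.open_handles < self.handle_limit


def run_long_haul(
    operation: Callable[[], None],
    is_healthy: Callable[[], bool],
    total_operations: int,
    check_every: int,
) -> Optional[int]:
    """Repeat `operation`, probing health every `check_every` operations.

    Returns the operation count at the first failed health check,
    or None if the system stayed healthy for the whole run.
    """
    for done in range(1, total_operations + 1):
        operation()
        if done % check_every == 0 and not is_healthy():
            return done
    return None


# A short run finishes before the leak exhausts the handle limit.
short_run = LeakySystem()
print(run_long_haul(short_run.do_operation, short_run.is_healthy, 1_000, 500))

# A much longer run at the same workload level eventually hits the limit.
long_run = LeakySystem()
print(run_long_haul(long_run.do_operation, long_run.is_healthy, 10_000, 500))
```

In this sketch the workload level is the same in both runs; only the duration differs, which is the property that distinguishes long-haul testing from stress testing as described above.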
In long-haul testing, an experienced programmer often chooses a workload level that the programmer would expect to be placed on the system under test during actual use of the system. For example, the programmer may expect that a certain server system would typically serve about ten client machines having about one hundred users, and that each user would execute a certain number of operations involving the server system. The programmer then configures the testing system to subject the server system under test to that level of machines, users, and operations during the long-haul test.
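The workload level described above could be captured in a small configuration object, sketched below. This is only an illustration: the `LongHaulWorkload` class and its field names are hypothetical, the example reads the one hundred users as a total spread across the ten client machines, and the per-user operation count and test duration are placeholder values chosen for the example rather than figures from any real test.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LongHaulWorkload:
    """Hypothetical description of the expected load for a long-haul test."""
    client_machines: int
    total_users: int
    operations_per_user_per_day: int  # assumed value, chosen by the programmer
    duration_days: int                # assumed value, chosen by the programmer

    def users_per_machine(self) -> List[int]:
        """Spread users across client machines as evenly as possible."""
        base, extra = divmod(self.total_users, self.client_machines)
        return [base + (1 if i < extra else 0) for i in range(self.client_machines)]

    def total_operations(self) -> int:
        """Total operations the system under test should see over the whole test."""
        return self.total_users * self.operations_per_user_per_day * self.duration_days


# About ten client machines and about one hundred users, as in the example above.
workload = LongHaulWorkload(
    client_machines=10,
    total_users=100,
    operations_per_user_per_day=200,  # placeholder
    duration_days=14,                 # placeholder
)
print(workload.users_per_machine())
print(workload.total_operations())
```

Keeping the expected load in one immutable object makes the programmer's assumptions explicit, so the same configuration can be reused or adjusted between test runs.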