Testing of software and hardware is a daunting undertaking. Exhaustively testing even a simple program that adds two 32-bit integer inputs (yielding 2^64 distinct test cases) would take many millions of years, even if tests were performed at a rate of thousands per second. Given the complexity of modern software, testing every permutation of all of the inputs and all of the functions of a software program is not possible. However, not testing software and hardware, particularly when our lives, finances and security depend on the proper operation of this technology, is unthinkable (or should be).
The challenge of testing software is made even more complex in today's vastly distributed networks like the Internet. Identifying and remedying problems that arise through unexpected interactions with third party software and hardware is extremely difficult.
The challenge is to find a means to exercise software and hardware systems (collectively, sometimes referred to as a “system”) to minimize the risk that the system will break down, produce erroneous data, or permit unauthorized access to data when operating in a real-world environment, or that the system cannot be expanded to handle increasing traffic.
Performance, capacity, and stress testing are all closely related, and load tests can be used to perform all three. The difference is in which test metrics are being evaluated. Performance testing determines the response times a user can expect from a system as the system is subjected to increasing load. Capacity testing determines how many concurrent and total users a system can handle with a predetermined acceptable level of performance. Capacity planning determines how many concurrent and total users a system needs to be able to support to meet the business objectives of the system operator. For example, the business objective might be, “The server needs to be able to support 400 concurrent users with response times under five seconds, and the database needs to support one million total users.” Capacity planning also involves determining what hardware and software are required to meet those business objectives.
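The relationship between load-test measurements and a capacity objective can be illustrated with a short sketch. It assumes a load test has already produced one response-time figure per load level. The `LoadSample` type, the function names, and the sample numbers are hypothetical and serve only to show the comparison against the example objective of 400 concurrent users with response times under five seconds.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LoadSample:
    concurrent_users: int   # load level applied during the test
    response_time_s: float  # response time observed at that load (assumed metric)


def measured_capacity(samples: Iterable[LoadSample], max_response_s: float) -> int:
    """Highest tested load such that it and every lower tested load met the limit."""
    capacity = 0
    for sample in sorted(samples, key=lambda s: s.concurrent_users):
        if sample.response_time_s >= max_response_s:
            break  # performance is no longer acceptable at this load
        capacity = sample.concurrent_users
    return capacity


def meets_objective(samples, required_users: int, max_response_s: float) -> bool:
    """True if the measured capacity covers the planned number of users."""
    return measured_capacity(samples, max_response_s) >= required_users


# Invented measurements, for illustration only.
SAMPLES = [
    LoadSample(100, 0.8),
    LoadSample(200, 1.4),
    LoadSample(400, 3.9),
    LoadSample(800, 9.2),
]

if __name__ == "__main__":
    print(measured_capacity(SAMPLES, max_response_s=5.0))
    print(meets_objective(SAMPLES, required_users=400, max_response_s=5.0))
```

The same measurements serve performance testing (reading the response times directly) and capacity testing (finding the highest acceptable load); only the metric being evaluated differs.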
Scalability is an important consideration in capacity planning because business objectives may change, requiring that the system handle more traffic. Naturally, an operator will want to add increased capacity to its site at the lowest cost possible.
Capacity testing is used to validate that a solution meets the business objectives determined during capacity planning. Stress testing is designed to determine the maximum load a system can sustain while generating fewer errors (e.g., timeouts) than a predetermined acceptable rate. Stress and stability testing also seeks to determine the maximum load a Web site can sustain without crashing.
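The stress-testing criterion described above, the maximum load sustained while the error rate stays below a predetermined acceptable rate, can be sketched as a simple load ramp. This is an illustrative sketch only: `run_load` stands in for whatever mechanism actually applies load, and `toy_system` is an invented stand-in rather than a model of any real server.

```python
from typing import Callable, Tuple


def find_max_sustainable_load(
    run_load: Callable[[int], Tuple[int, int]],
    start: int,
    step: int,
    ceiling: int,
    max_error_rate: float,
) -> int:
    """Ramp the load upward and return the highest load level whose error
    rate stayed at or below `max_error_rate` (0 if none did).

    `run_load(load)` must return (requests_sent, errors_observed).
    """
    highest_ok = 0
    load = start
    while load <= ceiling:
        requests, errors = run_load(load)
        error_rate = errors / requests if requests else 1.0
        if error_rate > max_error_rate:
            break  # acceptable error rate exceeded; stop ramping
        highest_ok = load
        load += step
    return highest_ok


def toy_system(load: int) -> Tuple[int, int]:
    """Hypothetical system: serves up to 500 requests per interval and
    times out on anything beyond that."""
    return load, max(0, load - 500)


if __name__ == "__main__":
    print(find_max_sustainable_load(toy_system, 100, 100, 2000, 0.05))
```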
Since the events of Sep. 11, 2001, additional test protocols have been developed to evaluate systems with respect to information assurance, authentication and controlled access. Addressing these security concerns may impact the performance of a system.
System testing today comes in several forms. Hardware manufacturers provide test beds to test software on branded servers, but the test bed does not facilitate scalability or other capacity testing. Third party testing tools allow system developers to perform testing in a “micro-environment” but do not provide means to test in a real-world environment.
A load testing system has been described as having multiple load testing servers that are configured to apply a load to a target web site, or other target server system, remotely over the Internet. The described system provides no ability for instrumentation in a closed area and no ability to bring testing to break point. For additional details, refer to U.S. Pat. No. 6,477,483 and U.S. Pat. No. 6,560,564 to Scarlatt et al.
A system has been described that uses an altered form of client cache which purports to enable more realistic and representative client requests to be issued during the testing process. The described system does not teach an infrastructure with full instrumentation that can be tested to failure, nor does it teach a method for replicating the complexity of large-scale deployment of applications on multiple servers in large distributed environments such as the Internet or a LAN. For additional details, refer to U.S. Pat. No. 6,418,544 to Nesbitt et al.
A structure has been described for generating packet streams that are configurable to simulate non-consecutive network traffic (e.g., Internet traffic). For additional details, refer to published patent application US 2003-0012141 by Gerrevink.
A method and system have been described for simulating multiple concurrent clients on a network server to stress test the server. For additional details, refer to U.S. Pat. No. 6,324,492 to Rowe.
A system has been described for testing communications network performance utilizing a test scenario simulating actual communications traffic on the network to be tested. Performance data may be monitored at one of the endpoint nodes of each endpoint node pair and reported to the console node either as it is generated or after completion of the test. For additional details, refer to U.S. Pat. No. 6,408,335 to Schwaller et al.
Methods and systems have been described for testing stateful network communications devices. According to one test method, stateful and simulated stateless sessions are established with a device under test. Packets are sent to the device under test over the stateful and stateless connections. Information received on the stateful connections is used to alter test conditions on the stateless connections. As a result, a realistic mix of network traffic can be achieved with a reduced amount of hardware. For additional details, refer to published patent application US 2003-0088664 by Hannel et al.
A system and method have been described for simulating a plurality of TCP connections directed toward an Internet site under test. The method includes retrieving information from the TCP connection and recording statistics related to the information. For additional details, refer to U.S. Pat. No. 6,295,557 to Foss et al.
A system and method have been described for accelerated reliability testing of computer system software components over prolonged periods of time. The system and method provide for tracking the reliability of system components and logging failures of varying severity that may be expected to occur over time. This data is useful, among other things, for estimating mean time between failures for software being tested and expected support costs. This information is particularly useful in providing a reliability measure where multiple independently developed software modules are expected to function together. The testing includes random scheduling of tasks and sleep intervals reflecting expected usage patterns, but at a faster pace, to efficiently sample the state space and detect sequences of operations that are likely to result in failures in actual use. For additional details, refer to U.S. Pat. No. 6,557,120 to Nicholson et al.
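The general idea of randomly scheduled tasks and sleep intervals run at an accelerated pace can be illustrated as follows. This sketch is not the method of the cited patent; it is a simplified illustration under stated assumptions: idle intervals are drawn from an exponential distribution, time is tracked on a simulated clock rather than by actually sleeping, and the task names and failure probability are invented.

```python
import random
from typing import Callable, List, Optional, Tuple

Task = Tuple[str, Callable[[random.Random], None]]


def open_document(rng: random.Random) -> None:
    """Hypothetical task that always succeeds."""


def flaky_save(rng: random.Random) -> None:
    """Hypothetical task that fails about 10% of the time."""
    if rng.random() < 0.1:
        raise RuntimeError("simulated save failure")


TASKS: List[Task] = [("open_document", open_document), ("flaky_save", flaky_save)]


def accelerated_run(tasks, n_steps, mean_idle_s, speedup=100.0, seed=0):
    """Run randomly chosen tasks separated by random idle intervals.

    Returns (simulated_usage_s, wall_clock_budget_s, failures). A real
    harness would sleep for idle / speedup between tasks; here the time
    is only accumulated so that the sketch runs instantly.
    """
    rng = random.Random(seed)
    simulated_s = 0.0
    wall_s = 0.0
    failures = []
    for _ in range(n_steps):
        idle = rng.expovariate(1.0 / mean_idle_s)  # assumed usage pattern
        simulated_s += idle
        wall_s += idle / speedup
        name, task = rng.choice(tasks)
        try:
            task(rng)
        except Exception as exc:  # log the failure and keep going
            failures.append((simulated_s, name, repr(exc)))
    return simulated_s, wall_s, failures


def estimate_mtbf(simulated_s: float, failures: list) -> Optional[float]:
    """Mean simulated time between failures, or None if no failure occurred."""
    return simulated_s / len(failures) if failures else None


if __name__ == "__main__":
    total, wall, fails = accelerated_run(TASKS, n_steps=1000, mean_idle_s=60.0)
    print(f"simulated {total:.0f} s in about {wall:.0f} s of wall-clock time")
    print(f"{len(fails)} failures, MTBF estimate {estimate_mtbf(total, fails)}")
```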
A graphical user interface has been described as contained on a computer screen and used for determining the vulnerability posture of a network. A system design window displays network icons of a network map that are representative of different network elements contained within the network. The respective network icons are linked together in an arrangement corresponding to how network elements are interconnected within the network. Selected portions of the network map turn a different color indicative of a vulnerability that has been established for that portion of the network after a vulnerability posture of the network has been established. For additional details, refer to U.S. Pat. No. 6,535,227 to Fox et al.
A computer-implemented method has been described for rules-driven multi-phase network vulnerability assessment. The method comprises pinging devices on a network to discover devices with a connection to the network. Port scans are performed on the discovered devices and banners are collected. Information from the collected banners is stored as entries in a first database. Analysis is performed on the entries by comparing the entries with a rule set to determine potential vulnerabilities. The results of the analysis are stored in a second database. For additional details, refer to U.S. Pat. No. 6,324,656 to Gleichauf et al.
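The analysis phase described above, in which stored banner entries are compared with a rule set and the results are stored separately, can be sketched as follows. The sketch covers only that comparison step and performs no pinging or port scanning. The in-memory lists standing in for the two databases, the product name `ExampleFTPd`, the rule format, and the addresses (taken from the reserved documentation range) are all hypothetical.

```python
import re
from typing import Dict, List

# Stand-in for the "first database": banners collected from discovered devices.
BANNER_ENTRIES: List[Dict] = [
    {"host": "192.0.2.10", "port": 21, "banner": "220 ExampleFTPd 1.0 ready"},
    {"host": "192.0.2.11", "port": 21, "banner": "220 ExampleFTPd 2.4 ready"},
    {"host": "192.0.2.12", "port": 80, "banner": "Server: ExampleHTTPd/3.1"},
]

# Hypothetical rule set: a regular expression plus the finding it implies.
RULES: List[Dict] = [
    {"id": "R1", "pattern": r"ExampleFTPd 1\.\d+", "finding": "outdated ExampleFTPd 1.x"},
    {"id": "R2", "pattern": r"ExampleHTTPd/3\.1\b", "finding": "ExampleHTTPd 3.1 needs review"},
]


def analyze(entries: List[Dict], rules: List[Dict]) -> List[Dict]:
    """Compare each banner entry with each rule and collect the matches."""
    findings = []
    for entry in entries:
        for rule in rules:
            if re.search(rule["pattern"], entry["banner"]):
                findings.append(
                    {
                        "host": entry["host"],
                        "port": entry["port"],
                        "rule": rule["id"],
                        "finding": rule["finding"],
                    }
                )
    return findings


if __name__ == "__main__":
    # Stand-in for the "second database": results of the analysis.
    results = analyze(BANNER_ENTRIES, RULES)
    for row in results:
        print(row)
```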
These described concepts are directed to a variety of problems associated with testing hardware and software systems. However, collectively they do not teach operating a closed testing environment that can faithfully duplicate (as opposed to emulate) the operating environment of the system under test (SUT). While certain data may be logged (errors, response times, etc.), the testing environment is not instrumented to permit a fully diagnostic view of the response of the SUT to simulated input. Further, the SUT is not tested to failure in conjunction with the instrumented environment to determine failure modes, recovery modes, and failure avoidance.
What is needed is a system and method for testing hardware and software systems in a fully instrumented environment that accurately duplicates the operating environment of the SUT and that can test the SUT to failure.