Testing is an activity that is often performed to check the quality and functionality of software and hardware computing systems. Testing can be performed to accomplish any number of desired objectives. For example, testing may be performed to provide an objective view of whether a component or system under test meets its design and performance requirements, has stable performance, performs its activities within required timeframes, provides appropriate responses, and/or achieves acceptable performance results.
All software testing employs a strategy to select tests that are feasible to apply to the system under test given the available time and resources. A common approach is to create a set of static tests embedded with pre-defined test cases that seek to explore the functionality of the system for those test cases. This approach is advantageous in that it provides a very targeted way of checking a specific set of conditions or outcomes that may be faced by the system being tested. However, what typically results from this process is a set of tests that limits the testing to only the fixed conditions built into the static tests. The static nature of these tests is therefore a very significant limitation upon the ability of the tests to broadly cover a range of test conditions and circumstances. This limitation of static testing may be addressed by creating even more static tests that each address other test conditions, but this approach creates scaling problems, as the number of static tests may quickly need to increase to very large numbers to reasonably cover an acceptable range of conditions and circumstances.
These problems are further exacerbated by the fact that modern computing systems may include many components and subsystems that are distributed across multiple networked nodes or computing clusters, and testing for this type of distributed system will necessarily involve processing under many configuration and operational variations that occur in complicated ways across the distributed system. For example, consider a distributed computing system that performs disaster recovery (DR) operations. Such DR operations involve activities to periodically back up data from a primary computing site to a secondary computing site, detect the presence of failures or problems, perform failover from the primary site to the secondary site upon detection of certain problems, and perform failback once the primary site has sufficiently recovered. Proper testing of these types of DR operations will require the consideration of numerous system conditions, error conditions, workload conditions, and network/hardware/software configuration conditions that make it virtually impossible to adequately perform testing in any sort of efficient, scalable, or comprehensive way with static tests.
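By way of illustration only, the scaling problem described above can be sketched by counting how many static tests would be needed to cover every combination of even a small set of DR test conditions. The condition names and values below (`operation`, `failure`, `workload`, `cluster_size`) and the helper `enumerate_static_cases` are hypothetical and chosen only for this sketch; they are not drawn from any particular system.

```python
from itertools import product

# Hypothetical condition dimensions for a DR test; values are illustrative only.
CONDITIONS = {
    "operation": ["backup", "failover", "failback"],
    "failure": ["none", "node_down", "network_partition", "disk_full"],
    "workload": ["idle", "moderate", "heavy"],
    "cluster_size": [1, 3, 5, 16],
}


def enumerate_static_cases(conditions):
    """Return one dict per combination of condition values (one static test each)."""
    names = list(conditions)
    return [dict(zip(names, combo)) for combo in product(*conditions.values())]


cases = enumerate_static_cases(CONDITIONS)
print(len(cases))  # 3 * 4 * 3 * 4 = 144
```

Under these assumptions, four small dimensions already call for 144 separate static tests, and adding one more dimension with five values would raise the count to 720, since the total is the product of the number of values in each dimension.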
Therefore, what is needed is an improved approach to implement testing for computing environments.