Testing the workload environment of a data storage environment including at least one data storage system and at least one software application operating on a host computer in communication with the data storage system is a complex task. It often requires that the business have a separate test-bed that contains a duplicate set of hardware where such tests take place. Large companies such as telecommunications companies, airlines, banks, and insurance companies routinely populate a test lab with a large amount of equipment, including software applications, for emulating production conditions. Other companies rely on vendors providing systems and software to run tests for them, but sometimes the various vendors are unable to replicate the myriad configurations that a particular customer may encounter within its own data storage environment.
The actual execution of application load-tests requires that a copy of the production database(s) be loaded on the storage systems and that a workload driver be created to generate either batch jobs or transactions that attempt to duplicate the production workload. Setup times and the analysis of the test results make such an effort extremely complex and limit such activities to the very few businesses that can afford the time and personnel costs.
The complexity of such a task often forces these tests to be simplified to the point where the results do not reflect the actual application. Furthermore, it becomes even more complicated to experiment with alternative configurations and map them onto the production system. Add to this the common requirement to see the effect of multiple applications on the same storage system and the problem is even further compounded.
Data storage owners who try to shortcut this effort often resort to general-purpose Input/Output (I/O) drivers that are available in the marketplace. Such drivers do not attempt to duplicate an existing workload. They simply provide the user with the ability to direct a specified stream of I/Os to particular data volumes or logical devices.
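By way of illustration only, the kind of user-specified I/O stream that such a general-purpose driver produces can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular commercial driver: the names `StreamSpec`, `IORequest`, and `generate_stream`, and the parameters shown, are hypothetical and do not appear in the original text. The sketch only builds the list of requests and performs no actual I/O.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StreamSpec:
    """Hypothetical user-supplied description of a synthetic I/O stream."""
    device: str           # target data volume or logical device (assumed name)
    num_ios: int          # how many I/Os to generate
    block_size: int       # size of every I/O in bytes
    read_fraction: float  # fraction of I/Os that are reads, 0.0 to 1.0
    device_blocks: int    # number of addressable blocks on the device
    seed: int = 0         # fixed seed so the stream is repeatable


@dataclass(frozen=True)
class IORequest:
    device: str
    op: str      # "read" or "write"
    offset: int  # byte offset, aligned to block_size
    length: int  # bytes


def generate_stream(spec: StreamSpec) -> List[IORequest]:
    """Build a uniform random stream from the spec; no real workload is consulted."""
    rng = random.Random(spec.seed)
    requests = []
    for _ in range(spec.num_ios):
        op = "read" if rng.random() < spec.read_fraction else "write"
        block = rng.randrange(spec.device_blocks)
        requests.append(
            IORequest(spec.device, op, block * spec.block_size, spec.block_size)
        )
    return requests
```

Note that nothing in the specification is derived from an observed production workload, which is the limitation described above.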
It would be an advancement in the computer arts, and particularly the data storage arts, to have a solution that could duplicate a workload in a data storage environment but would reduce the complexity of existing systems. Further, if such a solution significantly increased the accuracy and flexibility of such tests, that would also be a significant advantage over prior art techniques.
One area wherein duplicated workloads are useful is that of benchmark testing. The prior art benchmarking approach in the storage industry, however, has been to run static (i.e., canned), idealized, uniform I/O workloads. In many cases these benchmarks have no bearing on the actual environment for which benchmark results are desired. It would be an advancement in the arts to provide an invention with a new methodology for benchmarking storage by replaying exact I/O traces of customer workloads on different storage hardware and software platforms. It would be a further advancement if such a solution could customize the benchmark workload based on a customer's real production workload.
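For illustration only, the general idea of replaying a recorded I/O trace, as opposed to running a canned workload, can be sketched as follows. This is a minimal sketch under stated assumptions and is not presented as the method of the invention: the trace format (comma-separated timestamp in seconds, device, operation, byte offset, length) and the names `TraceRecord`, `parse_trace`, and `replay` are hypothetical. The `submit` callable stands in for whatever issues the I/O to the platform under test.

```python
import csv
import io
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class TraceRecord:
    timestamp: float  # seconds since the start of the capture (assumed unit)
    device: str
    op: str           # "read" or "write"
    offset: int       # byte offset
    length: int       # bytes


def parse_trace(text: str) -> List[TraceRecord]:
    """Parse the assumed CSV trace format, skipping blank lines."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        t, device, op, offset, length = (field.strip() for field in row)
        records.append(TraceRecord(float(t), device, op, int(offset), int(length)))
    return records


def replay(
    records: Iterable[TraceRecord],
    submit: Callable[[TraceRecord], None],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Issue each record in order, preserving the recorded inter-arrival gaps."""
    previous = None
    count = 0
    for rec in records:
        if previous is not None:
            # Wait for the gap observed between consecutive I/Os in the trace.
            sleep(max(0.0, rec.timestamp - previous))
        submit(rec)
        previous = rec.timestamp
        count += 1
    return count
```

Because `submit` is supplied by the caller, the same parsed trace could in principle be directed at different hardware or software platforms and the results compared. Passing a substitute for `sleep` allows the pacing to be inspected without actually waiting.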
It would also be an advancement in the computer arts if an invention having the advantages above were also capable of being used for comparing alternative algorithms from a performance perspective. It would also be advantageous if such an invention could be used for consolidation and capacity planning, i.e., allowing engineers to size new implementations with workload data collected from existing storage implementations.
Further, it would be advantageous to have an invention that could be used for problem recreation and troubleshooting by recreating the problem workload and carrying out various “what-if” scenarios.