A portion of the disclosure of this patent document contains command formats and other computer language listings, all of which are subject to copyright protection. The copyright owner, EMC Corporation, has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
This invention relates generally to systems and methods for replaying workload data produced in a data storage environment, and more particularly to a system and method that may access trace data of workload activity produced in a data storage system and then replay the trace data for testing or other reasons.
Testing the workload environment of a data storage environment, including at least one data storage system and at least one software application operating on a host computer in communication with the data storage system, is a complex task. It often requires that the business have a separate test-bed containing a duplicate set of hardware where such tests take place. Large companies such as telecommunications companies, airlines, banks, and insurance companies routinely populate a test lab with a large amount of equipment, including software applications for emulating production conditions. Other companies rely on vendors providing systems and software to run tests for them, but the vendors are sometimes unable to replicate the myriad configurations that a particular customer may encounter within its own data storage environment.
The actual execution of application load-tests requires that a copy of the production database(s) be loaded on the storage systems and that a workload driver be created to generate either batch jobs or transactions that attempt to duplicate the production workload. Setup times and the analysis of the test results make such an effort extremely complex and limit such activities to the very few businesses that can afford the time and personnel costs.
The complexity of such a task often reduces these tests to various levels of simplicity where the results do not reflect the actual application. Furthermore, it becomes even more complicated to experiment with alternative configurations and map them onto the production system. Add to this the common requirement to see the effect of multiple applications on the same storage system and the problem is even further compounded.
Data storage owners who try to shortcut this effort often resort to general-purpose Input/Output (I/O) drivers that are available in the marketplace. Such drivers do not attempt to duplicate an existing workload. They simply allow the user to define a particular stream of I/Os directed to specific data volumes or logical devices.
It would be an advancement in the computer arts, and particularly the data storage arts, to have a solution that could duplicate a workload in a data storage environment while reducing the complexity of existing systems. Further, if such a solution significantly increased the accuracy and flexibility of such tests, that would also be a significant advantage over prior art techniques.
To overcome the problems of the prior art and to provide the advantages described above, this invention is a system and method for operating a workload scenario in a data storage environment.
The method includes accessing a trace of workload activity experienced on one or more data storage volumes included with a first data storage system and playing a replication of the trace of workload data on one or more data storage volumes included with a second data storage system. The first and second systems can be the same system, i.e., the workload activity is replayed on the same system on which it was captured. Preferably the workload activity is accessed in the form of I/O activity.
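By way of illustration only, the two steps of the method (accessing a trace of I/O activity and playing it on volumes of a second system) might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the record layout `TraceRecord`, the `InMemoryVolume` stand-in for a data storage volume, the `replay` function, the volume names, and the fill pattern used for writes are all hypothetical and do not appear in this disclosure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class TraceRecord:
    """One traced I/O (hypothetical layout, assumed for illustration)."""
    timestamp: float  # seconds since the start of the capture
    volume: str       # volume identifier on the first (traced) system
    op: str           # "R" for read, "W" for write
    offset: int       # byte offset within the volume
    length: int       # number of bytes transferred


class InMemoryVolume:
    """Stand-in for a data storage volume on the second system."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.reads = 0
        self.writes = 0

    def read(self, offset: int, length: int) -> bytes:
        self.reads += 1
        return bytes(self.data[offset:offset + length])

    def write(self, offset: int, length: int) -> None:
        # Assumption: the trace carries no payload, so a fixed
        # pattern is written in place of the original data.
        self.writes += 1
        self.data[offset:offset + length] = b"\xAA" * length


def replay(trace: Iterable[TraceRecord],
           volume_map: Dict[str, str],
           targets: Dict[str, InMemoryVolume]) -> int:
    """Issue each traced I/O, in timestamp order, to its mapped target."""
    issued = 0
    for rec in sorted(trace, key=lambda r: r.timestamp):
        target = targets[volume_map[rec.volume]]
        if rec.op == "R":
            target.read(rec.offset, rec.length)
        elif rec.op == "W":
            target.write(rec.offset, rec.length)
        else:
            raise ValueError(f"unknown operation: {rec.op!r}")
        issued += 1
    return issued


# Example: three traced I/Os on source volume "src0" replayed on "tgt0".
trace = [
    TraceRecord(0.002, "src0", "R", 0, 512),
    TraceRecord(0.000, "src0", "W", 512, 4),
    TraceRecord(0.005, "src0", "W", 1024, 8),
]
targets = {"tgt0": InMemoryVolume(4096)}
issued = replay(trace, {"src0": "tgt0"}, targets)
```

Mapping the source volume to a target volume on the same system corresponds to the case in which the first and second systems are the same. This sketch preserves only the order of the traced I/Os; reproducing the original inter-arrival times would require an added pacing step, which is omitted here.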
In another embodiment, a system is provided that is configured for performing the steps of accessing a trace of workload activity experienced on one or more data storage volumes included with a first data storage system and playing a replication of the trace of workload data on one or more data storage volumes included with a second data storage system.
In another embodiment, a program product is provided that is configured for performing the steps of accessing a trace of workload activity experienced on one or more data storage volumes included with a first data storage system and playing a replication of the trace of workload data on one or more data storage volumes included with a second data storage system.