1. Technical Field
This disclosure relates generally to using a real data set to create 1) a test environment for performing tests; and/or 2) a duplicate application environment for generating reports and billing statements, performing data analysis, and the like.
2. Background of the Related Art
It is common and good practice for developers, performance analysis engineers, and quality assurance (QA) engineers to test their product using real-life data before handing the product to their customers. Applications that may require testing against a large, real-life data set include, without limitation, database applications and storage management software. To perform realistic functional and operational tests, an engineering organization assigns its developers and testers some amount of storage and server hardware to create test environments that are similar to a production environment. A large, real-life data set for testing can be created by restoring data from backup tapes, copying from a snapshot, or copying a saved master data set into a test environment. Test data can also be generated by an application-specific data generation tool. Once a very large data set is created or obtained, it is good practice to safeguard it. A saved data set may be compressed to save storage space. A well-managed QA organization typically also ensures that a good master copy of test data does not get overwritten by test software. Because the master copy must be preserved, each QA engineer typically has to make a separate copy of it to create a test environment. Thus, if there are 10 QA engineers, 10 full copies of the data set, and therefore 10 times the storage of a single copy, are needed to create 10 test environments for running all the tests in parallel.
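The storage cost of the full-copy approach can be illustrated with a short calculation. The following is only a sketch: the function name and the 500 GB master size are hypothetical values chosen for illustration, and the calculation assumes every engineer receives one uncompressed full copy of the master data set.

```python
def extra_storage_for_full_copies(master_size_gb: float, num_engineers: int) -> float:
    """Return the storage needed beyond the master copy when each
    engineer gets a private full copy (assumed: one copy per engineer)."""
    if master_size_gb < 0 or num_engineers < 0:
        raise ValueError("size and count must be non-negative")
    return master_size_gb * num_engineers


# Hypothetical master data set size; the source gives no concrete figure.
master_size_gb = 500.0
extra_gb = extra_storage_for_full_copies(master_size_gb, num_engineers=10)
print(f"Extra storage for 10 test environments: {extra_gb:.0f} GB")
```

The extra storage grows linearly with the number of parallel test environments, which is the cost the full-copy approach imposes.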
Testing with real-life data is also an important part of an IT process before an administrator rolls out a new deployment, updates production application software, or upgrades an IT environment. Creating a test environment and generating a test data set can be very time-consuming and costly. IT administrators often have to perform these tasks during off-hours, or postpone new rollouts or system upgrades while the tasks are being performed.
There are other scenarios in which the use of production data is desirable, e.g., data analysis, data mining, and auditing.
To overcome these challenges, some vendors have developed data cloning technology to make a clone for a test environment from a read-only snapshot. FIG. 1 illustrates how a point-in-time read-only snapshot is cloned to make a read-writable copy. FIG. 2 illustrates a snapshot clone taken from a backup vault. FIG. 3 illustrates multiple test environments using storage snapshot cloning technology. The latter case necessitates the creation of three snapshot clones (307a, 307b, and 307c) for the three test applications (309a, 309b, and 309c). These technologies are usually extended features of a storage device or a storage archive solution.
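The idea of making read-writable clones from a read-only snapshot can be sketched in a few lines. The source does not say how the vendors' clones are implemented; the sketch below assumes a copy-on-write scheme, which is one common approach, and the class names `Snapshot` and `Clone` are hypothetical. Each clone stores only the blocks it has changed and reads everything else from the shared snapshot.

```python
from typing import Dict, Iterable, Tuple


class Snapshot:
    """A point-in-time, read-only set of data blocks."""

    def __init__(self, blocks: Iterable[bytes]) -> None:
        self._blocks: Tuple[bytes, ...] = tuple(blocks)  # immutable

    def read(self, index: int) -> bytes:
        return self._blocks[index]

    def __len__(self) -> int:
        return len(self._blocks)


class Clone:
    """A read-writable view over a snapshot (assumed copy-on-write)."""

    def __init__(self, base: Snapshot) -> None:
        self._base = base
        self._changed: Dict[int, bytes] = {}  # blocks private to this clone

    def read(self, index: int) -> bytes:
        # Prefer this clone's own version; otherwise fall back to the snapshot.
        if index in self._changed:
            return self._changed[index]
        return self._base.read(index)

    def write(self, index: int, data: bytes) -> None:
        if not 0 <= index < len(self._base):
            raise IndexError("block index out of range")
        self._changed[index] = data  # the snapshot itself is never modified

    def private_block_count(self) -> int:
        return len(self._changed)


# Three clones for three test applications, as in the FIG. 3 scenario.
snapshot = Snapshot([b"alpha", b"beta", b"gamma"])
clones = [Clone(snapshot) for _ in range(3)]
clones[0].write(1, b"TEST")

print(clones[0].read(1))  # this clone sees its own change
print(clones[1].read(1))  # the other clones still see the snapshot data
print(snapshot.read(1))   # the snapshot is unchanged
```

Under this assumption, a write made by one test application is invisible to the others, yet each test application still requires its own clone object to be created and managed.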