Today, many large and medium sized companies have to manage vast amounts of data for an extended period of time. In one example, a company may maintain benefits data (insurance numbers, participants, social security numbers, account numbers, investments, portfolio value, etc.) for thousands of employees due to legal requirements or contractual obligations. In other examples, a different company may maintain purchasing and order data for millions of customers going back several years. Thus, for companies storing such voluminous data, it is necessary to operate and manage multi-terabyte databases.
These databases require great amounts of physical storage. The physical storage costs great sums of money to acquire, maintain, and operate. In some instances, a test database is also required to verify the function of different changes to the database or for other reasons. The test databases are a copy of at least a portion of the main database, also referred to as the production database. The test databases are also very costly to create because the test databases often require the same equipment and resources. Compounding the expense, companies generally need multiple copies of the main production database to test functionality of applications, performance of application programs, or upgrades to recent releases, for example. The several copies of the databases further increase the need for physical space to house the equipment and increase the associated costs. In extreme cases, the production databases are simply too large to obtain enough physical space to replicate and test the databases.
In general, companies try to replicate only a portion of the production database to minimize the costs associated with the test database. However, these test databases having a cross-section of the production database do not reasonably approximate the characteristics of a very large production database. For good test results, the performance characteristics are maintained for various application programs. For example, to test a benefits application, the test database needs to retain the historical distribution of benefits data for employees going back multiple enrollment cycles.
It is in light of these and other considerations that the present application is being presented.