This invention relates to efficient management and storage of data objects within databases. Database applications typically evolve over time. Changes are made to support new features, and old features are deleted. Changes to a database application typically require changing database objects. For example, database tables can be changed by adding or deleting columns, views can be modified to support changes in the table shape, and even whole tables can be added or deleted. The changes that occur to an application's database tables over time are typically referred to as database table schema evolution.
Database table schema evolution can cause problems in a data archiving tool because repeated runs of an archiving specification use the same set of tables over time. In the course of setting up an initial archive run, a specification is created that records the column information for all the tables that are to be archived. This set of tables is typically known as an “archive unit.” The recorded column information is needed for validation checks on the initial archiving run and on all later archiving runs. The validation checks ensure that the source tables still have the expected number of columns, the expected column names and column types, and other recorded properties, prior to each archiving run.
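By way of illustration only, a validation check of this kind can be sketched as follows. The sketch is not taken from any particular archiving tool; the names `Column` and `validate_source` and the `INTEGER` column types are assumptions introduced for the example, and only the column count, names, and types are compared.

```python
from typing import List, NamedTuple


class Column(NamedTuple):
    """Hypothetical record of one column in an archive specification."""
    name: str
    type: str


def validate_source(recorded: List[Column], current: List[Column]) -> List[str]:
    """Compare the recorded column information with the current source table.

    Returns a list of mismatch descriptions; an empty list means the check passes.
    """
    problems = []
    if len(recorded) != len(current):
        problems.append(
            f"column count: expected {len(recorded)}, found {len(current)}"
        )
    current_by_name = {col.name: col for col in current}
    for col in recorded:
        found = current_by_name.get(col.name)
        if found is None:
            problems.append(f"missing column {col.name}")
        elif found.type != col.type:
            problems.append(
                f"type of {col.name}: expected {col.type}, found {found.type}"
            )
    return problems


# Column information recorded when the archive specification was created.
recorded = [
    Column("COL1", "INTEGER"),
    Column("COL2", "INTEGER"),
    Column("COL3", "INTEGER"),
]

# The same table after one column has been renamed in the source.
renamed = [
    Column("COL1", "INTEGER"),
    Column("COL2X", "INTEGER"),
    Column("COL3", "INTEGER"),
]

print(validate_source(recorded, recorded))  # no mismatches reported
print(validate_source(recorded, renamed))   # reports the missing COL2
```

A check of this form detects that a recorded column is missing, but it cannot by itself tell whether the column was dropped or merely renamed.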
A key purpose of a data archiving tool is to preserve information over a long period of time. Over such a period, the schemas of the archived tables are likely to require changes, because the other applications that use those tables change over time and thereby require the tables to evolve.
Various data archiving tools, such as the DB2 Data Archive Expert tool by International Business Machines of Armonk, N.Y., permit users to add new columns to existing source tables, but do not permit any other changes to be made to the source tables. Adding the capability in data archiving tools for users to rename columns in the source tables, either independently or in conjunction with adding further columns to the source tables, can lead to various problems unless special considerations are made. For example, the addition or renaming of columns could cause a naming collision if the detected changes are applied serially to the corresponding target archive tables.
One such case is illustrated in FIGS. 1A-F. Assume, for example, that a user starts with a three-column table, TAB1, as illustrated in FIG. 1A. On the first run, the selected source data from TAB1 is archived to a target archive table, ARCHTAB1. ARCHTAB1 has the same three columns as TAB1, plus additional columns that store information specific to the archiving tool (for example, archive timestamp and sequence information). Over time, the following updates are applied to the table in the order shown:
    1. COL2 is renamed to COL2X, resulting in the table shown in FIG. 1B.
    2. COL3 is renamed to COL3X, resulting in the table shown in FIG. 1C.
    3. COL1 is renamed to COL2, resulting in the table shown in FIG. 1D.
    4. COL2X is renamed to COL1, resulting in the table shown in FIG. 1E.
    5. A new column COL3 is added to the table, resulting in the table shown in FIG. 1F.
If the user were to run the archive specification again at this point, either the data would be archived to the wrong columns (assuming that columns are matched by name only and that all three columns have the same type), or the archive run would fail. Thus, there is a need for improved handling of schema evolution in data archiving tools.
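The sequence of updates in FIGS. 1A-F can be simulated with a short sketch that shows how matching columns by name alone sends data to the wrong archive columns. The sketch is illustrative only: the helper names `rename` and `simulate` and the identity labels (such as "original COL1") are assumptions introduced here to track where each column's data came from, and the tool-specific columns of ARCHTAB1 are omitted.

```python
def rename(table, old, new):
    """Rename the column currently called `old` to `new`, in place."""
    for col in table:
        if col[0] == old:
            col[0] = new
            return
    raise KeyError(old)


def simulate():
    # Each source column is [current_name, identity]; the identity label
    # never changes and records which data the column actually holds.
    table = [
        ["COL1", "original COL1"],
        ["COL2", "original COL2"],
        ["COL3", "original COL3"],
    ]
    # ARCHTAB1 was created on the first run with the original names.
    archive_columns = ["COL1", "COL2", "COL3"]

    rename(table, "COL2", "COL2X")          # update 1, FIG. 1B
    rename(table, "COL3", "COL3X")          # update 2, FIG. 1C
    rename(table, "COL1", "COL2")           # update 3, FIG. 1D
    rename(table, "COL2X", "COL1")          # update 4, FIG. 1E
    table.append(["COL3", "new column"])    # update 5, FIG. 1F

    # Name-only matching: each archive column takes its data from the
    # source column that currently carries the same name.
    source_by_name = {name: identity for name, identity in table}
    mapping = {name: source_by_name.get(name) for name in archive_columns}
    return table, mapping


table, mapping = simulate()
for archive_col, source_identity in mapping.items():
    print(f"ARCHTAB1.{archive_col} <- data of {source_identity}")
```

In this simulation, every archive column finds a source column with a matching name, so a name-only check raises no objection, yet none of the three archive columns receives the data it held on the first run, and the data of the original COL3 (now COL3X) is not archived at all.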