1. Field of the Invention
This invention relates generally to copying data tables in a relational data base system and, more particularly, to producing, manipulating, and maintaining user defined snapshot copies of data tables in a relational data base.
2. Description of the Related Art
A relational data base comprises a collection of tables containing dam. A relational data base manager permits users and applications software to create, change, or destroy the data in the tables, the relationships between the tables, and even the structure of the tables. A query language, such as SQL, is used to manipulate the data base and retrieve query results. Many relational data bases are transaction-oriented, meaning that a data table change comprises a transaction that involves two or more related activities and that the transactions are treated in an "all-or-nothing" manner. That is, the activities comprising a transaction are followed by a commit command and none of the activities comprising a transaction is actually implemented unless all are implemented by a commit command. For example, in a transaction-oriented banking application, the first activity might comprise a debit to a table and a second activity might comprise a credit to another table, followed by an SQL commit command that completes the transaction by implementing the first two activities to thereby change the affected tables.
Through the relational data base management system, users can request copies of selected portions of the tables in a data base. These copies are created, for example, by the data base management system in response to a user SQL query that defines a selected data base portion that is of interest. These copies are often referred to as "snapshots" of the data base, because they reflect the condition of the data base at a particular point in time. Snapshots can be used to create additional copies of data tables for concurrent use by multiple users. Snapshots are generally read-only tables that are created out of the original user data base tables and cannot be changed. It can be necessary to specify many different attributes of a snapshot to enable the data base management system to extract the desired data records from the tables.
When requesting a snapshot, a user must specify a source table, from which the selected data records will be extracted, and also must specify a target table, to which the data records will be copied. It can become burdensome for a user to specify details of the copying operation. For example, to copy a table, a computer system would generally need to know where in the computer storage medium the beginning byte address of the table is located. Users, however, usually do not have such information. Users tend to think of their data tables in terms of table names. A similar situation holds for the target table. It also can be burdensome for a user to specify the structure of a copy if the user wants to make a relatively sophisticated extraction. For example, a user might want to request an aggregation of data or summary of data records that requires combining rows or columns of data tables, either by numerical addition, set intersection operators, or other complex manipulations. It would be necessary for the user to specify where the data records can be found, how they should be combined, what structure the resulting data records will have, and where to put the extracted, or copied, data records. Specifying a desired copy of a snapshot, that is, a snapshot of a snapshot, becomes even more complicated.
Another attribute of snapshots that might have to be specified relates to how changes to the source user data tables will be propagated to the snapshots. After a snapshot has been produced, it can be refreshed to reflect all changes to the source user data table since the creation of the initial snapshot or the time of the last refresh operation. For example, the original SQL query might be applied to the updated user data table, producing an updated, refreshed snapshot. It can be burdensome just to specify the frequency of refreshing a snapshot, but it can be much more so if the data base management system applies what is commonly known as differential refreshing. In differential refreshing, the amount of copying necessary to refresh a snapshot is reduced.
For example, rather than returning to the source user tables to replace an initial snapshot with a complete, updated copy of the selected portions of user tables, a data base management system can refresh an initial snapshot by propagating update information received since the initialization of the snapshot or the last prior update. Such information is generally obtained from an activity log, where all change operations to the source tables are recorded. It should be apparent that specifying the details of how to carry out such updating for a snapshot can be a daunting task for a user and may not even be possible, depending on the tables to which the user is given access. A copy algorithm would be necessary to find the proper place in the activity log to begin applying updates to source tables and these updates would need to be serialized between complete refresh operations. In requesting a snapshot copy, a user would like to be free from specifying the details of how such copying should take place and specifying the attributes of the copies. That is, the less of the copy algorithm the user must provide, the easier it will be for the user to obtain desired snapshot copies.
From the discussion above, it should be apparent that there is a need for a relational data base management system that permits users to more easily specify copy operations without being burdened with specifying details of structure, copy refresh algorithm, and the like. The present invention satisfies this need.