A snapshot is a body of data constructed of data from a "master" table. The master table may be local or remote relative to the snapshot. The data contained within a snapshot is defined by a query that references the master table and optionally other tables, views or snapshots. A snapshot can be refreshed on a periodic basis to reflect the current state of its corresponding base tables. An "updatable snapshot" is a snapshot to which updates may be directly made. Such updates are propagated from the snapshot back to the master table.
One method of refreshing snapshots is to reissue the defining query for the snapshot and simply replace the previous snapshot data with the results of the reissued query. This method is referred to as a "complete refresh." Complete refreshes are particularly disadvantageous when the master and snapshot tables are located at different sites in a network, because data for the entire refreshed snapshot table must be sent through the network.
Alternatively, another method known as a "fast refresh" can be performed to expedite the refresh operation, by transferring to the snapshot only those changes to the master table which have been made since the last refresh of the snapshot. A log file (referred to as a "master log") can be employed to track and record the rows that have been updated in the master table. When a snapshot is refreshed, only the appropriate rows in the master log need to be applied to the snapshot table. In a networked environment, only those modified rows found at the master site are transferred across the network and updated or inserted into the snapshot. Rows deleted in the master table are also deleted in the snapshot. Fast refresh is typically faster, more efficient, and involves less network traffic than a complete refresh.
For example, a company with 100 sales associates maintains three tables, shown in FIG. 2(a) to keep track of its sales information. Customer table 204 is a table that contains information about the company's customers. One of the columns in the table is a customer identifier (CID), which is a primary key, uniquely assigned to each customer. In the figures, primary key columns are indicated by an asterisk (*). Another column in table 204 is ZIP, which stores the zip code of the customer. Other columns, not shown, include street addresses, telephone numbers, and so forth. Order table 202 contains information about a customer's order. Each row in the order table has an order identifier (OID) and a CID. Each customer may place several orders, so there is a many-to-one relationship between the CID column of customer table 204 and the CID column of order table 202. In the figures, the many-to-one relationship is designated by a line connecting two tables with an inverted "V" on the "many" side. Order line table 200 stores an order line identifier (OLID) and an OID. Since each customer can have several orders, and each order several order lines, order line table 200 is usually much larger than customer table 204.
In this example, sales associates are assigned their territories by zip code, and they would like to keep a copy of only the relevant sales information on their laptop computers. Thus, if sales associate Smith is assigned only to zip codes 19555 and above, then Smith is only interested in the customers, orders, and order lines for zip code 19555 and above. Consequently, Smith creates a snapshot with a snapshot definition query for each table to retrieve that information in the snapshot. Since a snapshot also stores administrative information, the database system presents to Smith a snapshot view, which hides that administrative information. In FIG. 2(b), customer snapshot view 214 is the result of creating the snapshot with the following SQL query: