Under certain conditions, it is desirable to maintain copies of a particular body of data, such as a relational database table, at multiple sites. The mechanism for maintaining multiple copies of the same body of data at multiple sites is generally referred to as “data replication.” In a distributed database system using data replication, multiple replicas of data exist in more than one database in the distributed database system.
One kind of data replication employs snapshots. A snapshot is a body of data constructed of data from one or more “master” tables, views, or even other snapshots, any of which can be stored locally or remotely relative to the snapshot. The data contained within the snapshot is defined by a query that references one or more master tables (and/or other database objects) and reflects the state of its master tables at a particular point in time. To bring the snapshot up-to-date with respect to the master tables, the snapshot is refreshed upon request, e.g. at a user's command or automatically on a periodic, scheduled basis.
There are two basic approaches for refreshing a snapshot. “Complete refreshing” involves reissuing the defining query for the snapshot and replacing the previous snapshot with the results of the reissued query. “Incremental refresh” or “fast refresh” refers to identifying the changes that have happened to the master tables (typically, by examining a log file of the changes) and transferring only the data for the rows in the snapshot that have been affected by the master table changes. An “updatable snapshot” is a snapshot to which updates may be directly made, which are propagated from the snapshot back to the master table before refreshing.
Traditionally, snapshots have been implemented for high-end computer systems, which are characterized by the use of high performance computers that are interconnected to one another by highly reliable and high bandwidth network links. Typically, highly experienced database administrators manage these high-end systems. Due to the expense of these high-end computers, high-end distributed systems tend to involve a small number of networked sites, whose users can be trusted at least in part because of the physical security of the computers.
Recently, there has been much interest in the marketplace for applications for front office automation. One example is sales force automation, where hundreds, if not thousands, of sales representatives in a company are given laptops to improve their productivity. The laptops are loaded with applications, for example, to help a sales representative sell the company's products to a customer and take the customer's order. Therefore, the laptops include a data store to keep the customer and order information handy for use by a specific sales representative.
Front office automation, however, challenges the operating assumptions behind the high-end snapshot implementations. For example, high-end snapshot replication uses a full relational database system at each site to drive the snapshot refreshes, receive the row data in a SQL format, and apply the changes. Since laptops are computationally constrained, it is desirable to implement thin clients, for example JAVA™ applications, rather than requiring a full relational database system. However, the high-end row transfer mechanism employs a thick, SQL application programming interface (API).