Field of the Invention
The present invention relates generally to data processing environments and, more particularly, to a system providing methodology for hybrid data replication.
Background Art
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about the underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added to or removed from data files, retrieved from or updated in such files, and so forth, all without user knowledge of the underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of database management systems is well known in the art. See, e.g., Date, C., “An Introduction to Database Systems, Seventh Edition”, Addison Wesley, 2000.
Increasingly, businesses base their operations on mission-critical systems which store information on server-based database systems, such as Sybase® Adaptive Server® Enterprise (ASE) (available from Sybase, Inc. of Dublin, Calif.). As a result, the operations of the business are dependent upon the availability of data stored in their databases. Because of the mission-critical nature of these systems, users of these systems need to protect themselves against loss of the data due to software or hardware problems, disasters such as floods, earthquakes, or electrical power loss, or temporary unavailability of systems resulting from the need to perform system maintenance.
One well-known approach that is used to guard against loss of critical business data maintained in a given database (the “primary database”) is to maintain one or more standby or replicate databases. A replicate database is a duplicate or mirror copy of the primary database (or a subset of the primary database) that is maintained either locally at the same site as the primary database, or remotely at a different location than the primary database. The availability of a replicate copy of the primary database enables a user (e.g., a corporation or other business) to reconstruct a copy of the database in the event of the loss, destruction, or unavailability of the primary database.
Database replication technologies comprise a mechanism or tool for duplicating data from a primary source or “publisher” to one or more “subscribers”. The data may also be transformed during this process of replication.
In many cases, a primary database may publish items of data to a number of different subscribers. Also, in many cases, each of these subscribers is only interested in receiving a subset of the data maintained by the primary database. In this type of environment, each of the subscribers specifies particular types or items of data (“subscribed items”) that the subscriber wants to receive from the primary database.
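By way of illustration only, the publisher/subscriber arrangement described above may be sketched as follows. This is a minimal sketch and not a description of any particular replication product; the names `Subscriber`, `wants`, and `publish`, as well as the sample employee rows, are hypothetical and are introduced solely for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Subscriber:
    """Hypothetical subscriber holding its own copy of the subscribed items."""
    name: str
    # Predicate selecting the "subscribed items" this subscriber wants.
    wants: Callable[[Row], bool]
    received: List[Row] = field(default_factory=list)


def publish(rows: List[Row], subscribers: List[Subscriber]) -> None:
    """Deliver each published row only to the subscribers whose predicate accepts it."""
    for row in rows:
        for sub in subscribers:
            if sub.wants(row):
                sub.received.append(dict(row))  # copy so replicas are independent


# Sample data for the primary database (assumed for illustration).
employees = [
    {"name": "Ann", "dept": "sales", "salary": 50000},
    {"name": "Bob", "dept": "engineering", "salary": 70000},
    {"name": "Cy", "dept": "sales", "salary": 55000},
]

sales = Subscriber("sales_replica", lambda r: r["dept"] == "sales")
everything = Subscriber("full_replica", lambda r: True)
publish(employees, [sales, everything])
```

In this sketch the first subscriber receives only the sales rows, while the second receives every row, mirroring the case where each subscriber is interested in a different subset of the primary data.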
In current replication environments, replication typically requires the replicate to be materialized before replication begins. Materialization refers to the process of copying data, specified by a subscriber, from a published primary database to a replicate database, thereby initializing the replicate table(s). Once the replicate has been materialized, replication may proceed immediately.
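Materialization as described above may be illustrated with the following minimal sketch, which uses in-memory SQLite databases from the Python standard library as stand-ins for the primary and replicate databases. The `materialize` helper, the `employees` table, and its columns are assumptions made for this example and do not reflect the interface of any actual replication system.

```python
import sqlite3

SCHEMA = "(name TEXT PRIMARY KEY, dept TEXT, salary INTEGER)"


def materialize(primary, replicate, where="1=1"):
    """Hypothetical helper: initialize the replicate's employees table
    with the rows of the primary that match the subscription filter."""
    replicate.execute(f"CREATE TABLE IF NOT EXISTS employees {SCHEMA}")
    # NOTE: the filter is interpolated directly for brevity; this is for
    # illustration only and is not safe for untrusted input.
    rows = primary.execute(
        f"SELECT name, dept, salary FROM employees WHERE {where}"
    ).fetchall()
    replicate.executemany(
        "INSERT OR REPLACE INTO employees VALUES (?, ?, ?)", rows
    )
    replicate.commit()
    return len(rows)


# Stand-in primary database with sample rows (assumed for illustration).
primary = sqlite3.connect(":memory:")
primary.execute(f"CREATE TABLE employees {SCHEMA}")
primary.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "sales", 50000), ("Bob", "engineering", 70000)],
)
primary.commit()

# Stand-in replicate database, empty until materialized.
replicate = sqlite3.connect(":memory:")
copied = materialize(primary, replicate, "dept = 'sales'")
```

After the call, the replicate holds only the subscribed subset of the primary table, which is the starting state from which subsequent replication would proceed.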
Depending on the needs of a given environment, continuous replication or snapshot replication may be performed following materialization. Continuous replication refers to log-based replication from a primary database to a replicate database and offers near real-time protection by capturing data completely via the log. Snapshot replication offers a point-in-time copy and thus is considered mutually exclusive with continuous replication for a given primary and replicate database pair.
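The difference between the two forms of replication may be illustrated with the following simplified sketch, in which a table is modeled as a dictionary and the transaction log as a list of change records. All names (`snapshot_replicate`, `continuous_replicate`, `write`) and the record layout are hypothetical and chosen only for this example; an actual log-based system would be considerably more involved.

```python
from typing import Dict, List, Tuple

Table = Dict[str, int]             # key -> value
LogRecord = Tuple[str, str, int]   # (operation, key, value)


def snapshot_replicate(primary: Table) -> Table:
    """Snapshot replication: take a point-in-time copy of the whole table."""
    return dict(primary)


def continuous_replicate(replicate: Table, log: List[LogRecord], start: int) -> int:
    """Log-based replication: apply log records from position `start` onward
    and return the new log position."""
    for op, key, value in log[start:]:
        if op == "delete":
            replicate.pop(key, None)
        else:  # "insert" or "update"
            replicate[key] = value
    return len(log)


primary: Table = {"a": 1, "b": 2}
log: List[LogRecord] = []


def write(op: str, key: str, value: int = 0) -> None:
    """Apply a change to the primary and record it in the log."""
    if op == "delete":
        primary.pop(key, None)
    else:
        primary[key] = value
    log.append((op, key, value))


replicate = snapshot_replicate(primary)    # materialized replicate
stale_copy = snapshot_replicate(primary)   # snapshot that is never refreshed
applied = 0

write("update", "a", 10)
write("insert", "c", 3)
write("delete", "b")

applied = continuous_replicate(replicate, log, applied)
```

Here the log-fed replicate tracks the primary after each batch of log records is applied, whereas the snapshot copy continues to reflect the primary as of the moment the snapshot was taken.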
Such mutual exclusivity may compromise performance for certain environments that would benefit from being able to switch from one form of replication to another. For example, there may be situations, such as certain times of day, when it is known that limited activity would be occurring in a primary database, such that snapshot replication would be sufficient, while at other times continuous replication would be preferable. Accordingly, a need exists for an approach to replication that avoids these limitations. The present invention addresses such a need.