A database may be defined as a structured collection of records or data that is stored in a computer so that a program can consult it to answer queries. The records retrieved in answer to queries become information that may be used to make decisions. The computer program used to manage and query a database may be referred to as a database management system.
Typically, for a given database, there is a structural description of the type of facts held in the database. The structural description may be referred to as a “schema.” The schema describes the objects that are represented in the database, and the relationships among them. There are a number of different ways of organizing a schema, that is, of modeling the database structure. These are known as database models or data models. One such model is referred to as the “relational model.” The relational database model represents all information in the form of multiple related tables each consisting of rows and columns. The relational database model represents relationships by the use of values common to more than one table.
As discussed above, a database, such as a relational database, may include tables each consisting of rows and columns. One or more of these tables may be referred to as a “source table(s).” A source table refers to the table that includes the latest data populated by an external data source. For example, an application taking online orders may be populating the source table with order information (e.g., the product being purchased, the credit card number of the buyer, etc).
Further, one or more of the tables in the database, such as a relational database, may be referred to as a “target table(s).” A target table may refer to a table that serves as a back-up table to the source table. Hence, there is an effort in ensuring that the data in the target table is current and consistent with the data in the source table. The process in ensuring that the data in the target table is current and consistent with the data in the source table may be referred to as “synchronization.”
Synchronization may occur by having an application of the database management system reading a log which indicates the changes to the source data (data in the source table) and then mirroring those changes into the target table. However, such applications are not platform agnostic. That is, multiple versions of these applications have to be created in order to be used on different platforms. For example, one could not use an IBM™ DB2™ application to ensure that changes to the source data are mirrored into a target table with an Oracle™ implementation.
Synchronization may also occur by performing Structured Query Language (SQL) queries for retrieving the data in the entire source table and then updating the entire target table with the retrieved data. Synchronization in this manner is inefficient as a lot of time is spent reading the data from the source table that has already been updated in the target table. Further, synchronization in this manner is inefficient as data is unnecessarily moved from the source table to the target table where this data is already stored in the target table thereby unnecessarily increasing the use of the network capacity. Further, there is a problem, referred to herein as “contention,” where an external data source (e.g., online order application) attempts to access the source table at the same time as data is being copied from the source table. Since the entire source table is being copied, the external data source may have to wait a significant period of time before the external data source has access to the source table.
Therefore, there is a need in the art to synchronize the relational source and target tables in an efficient manner using an application that is platform agnostic while minimizing the contention at the source table.