In large electronic-commerce websites, it is often required to synchronize the data in a backend data warehouse with a frontend database, such as an ORACLE™ database for Online Transaction Processing (OLTP) for the purpose of processing queries by a frontend application server.
The ORACLE™ database includes an undo/rollback table space. Raw data that affects data lines is stored in the undo/rollback table space when a session performs Update and Delete operations in a Data Manipulation Language (DML) on the database. When another session accesses these data lines before the session in question has been committed, the session will receive indication that the data has been modified but not yet committed and thus will retrieve the raw data from the undo/rollback table space in accordance with stored undo/rollback address information. The occupied rollback segment will not be released until the session in question is committed or until the rollback is finished.
FIG. 1 is a diagram illustrating a typical data synchronization technique. In this example, table t1 is a table residing in a frontend OLTP ORACLE™ database, the Data Warehouse (DW) is a backend data warehouse, and the Application server retrieves data from table t1. A typical data synchronization technique is as follows (all the following commands are executed at the frontend database): in the first step, data in a user table t1 is deleted using a command such as delete from schema1.tablename and the deleted data is temporarily stored in an undo/rollback table space; in the second step, data is retrieved from a user table t2 of the backend data warehouse and inserted into the user table t1 using a command such as insert into schema1.tablename select * from schema2.tablename@lnk_dw and synchronized but uncommitted data is also temporarily stored in the undo/rollback table space; in the third step, the data is committed; in the fourth step, data is rolled back if an abnormality occurs. Since the ORACLE™ database is adopted for both the frontend and the data warehouse, the data can be imported/exported directly via a database link. For example, Ink_dw used in the second step of the above steps represents a database link.
In the above data synchronization technique, all operations of the same data service are performed on the same table, that is, a table in the backend data warehouse corresponds to a table in the frontend database. As such, a large amount of data has to be stored in the undo/rollback table space during operation. The following drawbacks exist in the existing technique:
(1) In the above steps, a significant portion of the undo/rollback table space is used in the first and second steps, which tends to cause problems. If, for example, the undo/rollback table space is exhausted, then other normal DML operations such as “Insert”, “Update” and “Delete invoked during the normal course of a service may be affected. If the undo/rollback table space is fully occupied in a peak period, then respective DML statements sent from the application server to the database will have to wait, thus causing the connection pool at the server to be fully occupied. Consequently, an increasing number of incoming requests to the application server will have to wait for connections and a vicious cycle follows, wherein resources such as memory on the application server becomes exhausted, causing the application server to stop functioning normally.
(2) The execution of the first and second steps can take a long time. During this time, if any abnormality occurs, the fourth rollback step will be executed. The resulting system has poor error tolerance. When there is a large amount of data to be rolled back, the rollback operation can take a long period to perform, thus degrading the performance of a production database.
(3) Data synchronization is inefficient, primarily because of the deletion operation of the first step.
(4) It is difficult to resume a broken transport.
(5) Data synchronization progress cannot be easily monitored.
As the amount of data to be synchronized increases, the undo/rollback table space of the frontend ORACLE™ database is likely to be fully occupied, causing the performance of the system to be negatively impacted.