The present invention relates to a technique for handing over processing executed by one information processing system to another information processing system, or to a program or object for executing that processing, when a disaster or a predetermined condition occurs in the first information processing system or in response to a request. In particular, it relates to a database management system.
In a conventional database management system, a storage region (hereinafter referred to as “DB buffer”) is reserved on a memory of a computer (hereinafter referred to as “database server” or “DB server”) executing a database management program. Update data (hereinafter referred to as “DB data”) are temporarily stored in the DB buffer to speed up the rewriting of the DB data (table space) in a database by a transaction. The data written to the DB buffer are finally written to a storage system having a nonvolatile storage medium (hereinafter referred to as “storage system”). Generally, the database server uses the DB buffer as a temporary storage means because the access time to the storage system is longer than the access time to the memory.
The speed of data write/read (hereinafter generically referred to as “I/O”) in the DB buffer is higher than the speed of I/O in the storage system. The memory of the database server is, however, generally volatile. Data stored in the memory vanish at the time of power failure or server restart. In addition, data may vanish at the time of hardware failure in the database server. The database management system generates and manages a log (journal) to prevent the contents of a committed transaction from vanishing in such a case.
Specifically, before a transaction is committed, the database management system always writes the update contents of the DB data for that transaction, as a log having a log sequence number (LSN), to a log storage region (hereinafter referred to as “logical disk”) of the storage system.
When DB data on the DB buffer are written in a DB data logical disk of the storage system by a checkpoint process, the database management system records information of the checkpoint process corresponding to the log sequence number as status information of the log.
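The logging and checkpoint rules described above can be illustrated with a minimal sketch. The class `MiniDB`, its fields, and the dictionary layout of a log record are hypothetical names chosen for illustration only; they are not taken from any actual database management system.

```python
from dataclasses import dataclass, field

@dataclass
class MiniDB:
    # Nonvolatile parts (survive a server failure in this sketch).
    log_disk: list = field(default_factory=list)    # log logical disk
    data_disk: dict = field(default_factory=dict)   # DB data logical disk
    checkpoint_lsn: int = 0                         # status information
    # Volatile parts (lost on power failure or restart).
    db_buffer: dict = field(default_factory=dict)
    log_buffer: list = field(default_factory=list)
    next_lsn: int = 1

    def _append(self, txn, kind, key=None, value=None):
        # Every log record receives the next log sequence number (LSN).
        self.log_buffer.append(
            {"lsn": self.next_lsn, "txn": txn, "kind": kind,
             "key": key, "value": value})
        self.next_lsn += 1

    def _force_log(self):
        # Move buffered log records to the nonvolatile log logical disk.
        self.log_disk.extend(self.log_buffer)
        self.log_buffer.clear()

    def update(self, txn, key, value):
        # The update touches only the volatile DB buffer.
        self._append(txn, "update", key, value)
        self.db_buffer[key] = value

    def commit(self, txn):
        # The log is written to the storage system before commit completes.
        self._append(txn, "commit")
        self._force_log()

    def checkpoint(self):
        # Write the log first, then the DB buffer contents, then record
        # the LSN that this checkpoint corresponds to.
        self._force_log()
        self.data_disk.update(self.db_buffer)
        self.checkpoint_lsn = self.next_lsn - 1
```

In this sketch, a committed update is guaranteed to be on `log_disk` even while `data_disk` is still unchanged; only `checkpoint()` brings `data_disk` up to date and records the corresponding LSN.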
When the database management system is restarted after a disaster, it uses the log to write to the DB data logical disk the update data of transactions that had been committed by the time the disaster occurred, and it cancels data updates belonging to non-committed transactions. Because data updates on the DB buffer are reflected on the DB data logical disk by the checkpoint process, the log used in this case is the portion recorded after the latest checkpoint. The log sequence number corresponding to the latest checkpoint is determined by referring to the status information.
The data recovery process based on the log has been described in detail in Jim Gray et al., “TRANSACTION PROCESSING; Concepts and Techniques”, pp. 556-557, 604-609.
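As a rough illustration of the restart process, the following sketch redoes the log portion recorded after the latest checkpoint and then cancels updates of non-committed transactions. It is a simplification, not the procedure of the cited reference: the function name `restart_recovery`, the record layout with `before` and `after` images, and the assumption that no transaction is in progress when a checkpoint is taken are all introduced here for illustration.

```python
def restart_recovery(log, data_disk, checkpoint_lsn):
    """Return the recovered DB data as a new dict.

    Assumption of this sketch: no transaction spans a checkpoint, so only
    log records with an LSN greater than checkpoint_lsn need to be examined.
    """
    tail = [r for r in log if r["lsn"] > checkpoint_lsn]
    committed = {r["txn"] for r in tail if r["kind"] == "commit"}
    db = dict(data_disk)

    # Redo: reapply every logged update in LSN order.
    for r in tail:
        if r["kind"] == "update":
            db[r["key"]] = r["after"]

    # Undo: walk backwards and restore the before-image of every update
    # that belongs to a non-committed transaction.
    for r in reversed(tail):
        if r["kind"] == "update" and r["txn"] not in committed:
            if r["before"] is None:
                db.pop(r["key"], None)   # the key did not exist before
            else:
                db[r["key"]] = r["before"]
    return db


log = [
    {"lsn": 1, "txn": "T1", "kind": "update", "key": "a", "before": None, "after": "1"},
    {"lsn": 2, "txn": "T1", "kind": "commit"},
    # A checkpoint corresponding to LSN 2 is assumed to have been taken here.
    {"lsn": 3, "txn": "T2", "kind": "update", "key": "a", "before": "1", "after": "2"},
    {"lsn": 4, "txn": "T3", "kind": "update", "key": "b", "before": None, "after": "9"},
    {"lsn": 5, "txn": "T2", "kind": "commit"},
]
recovered = restart_recovery(log, {"a": "1"}, checkpoint_lsn=2)
```

Here `T2` committed before the failure, so its update to `a` survives, while the update made by the non-committed `T3` is cancelled.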
Because the recovery process assumes that the DB data logical disk and the log logical disk can be used when the database management system is restarted, the recovery process cannot be used when the storage system itself suffers from a disaster such as an earthquake, a fire or a terrorist act. As a technique provided for such a case, there is known a method in which the log and DB data necessary for restarting the database management system are sent in advance to a remote computer system not suffering from the disaster (hereinafter referred to as “recovery site”). Specifically, a remote copy technique is known.
Remote copy is a technique in which a computer system running a database management system or the like (hereinafter referred to as “main site”) and a storage system on the recovery site are connected to each other by a communication line (hereinafter referred to as “link”), and in which a storage system on the main site (hereinafter referred to as “main storage system”) sends data to be written in the main storage system (hereinafter referred to as “write data”) to the recovery site. Incidentally, as a modified example, there is also a technique in which a computer or switch connected to the main storage system sends write data to the recovery site.
Remote copy is classified into synchronous remote copy and asynchronous remote copy. In synchronous remote copy, the process of sending data to the recovery site is synchronized with a write request process from a computer on the main site (hereinafter referred to as “host”); that is, write data are transferred to the recovery site before completion of the write request process, and only then is a notice of completion of the write request process sent to the host. In asynchronous remote copy, the two processes are carried out asynchronously; that is, when the data write based on the write request process is completed, a notice of completion of the data write is sent to the host, and the data are transferred to the recovery site afterward. The remote copy technique has been disclosed in U.S. Pat. No. 5,640,561 and JP-A-11-85408. In particular, U.S. Pat. No. 5,640,561 has disclosed an asynchronous remote copy technique that guarantees that the data update sequence in a storage system on the recovery site (hereinafter referred to as “sub storage system”) is the same as the data update sequence from the host to the main storage system.
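The difference between the two kinds of remote copy can be sketched as follows. The class `MainStorageSystem` and its methods are hypothetical and model only the ordering of transfer and completion notice; they do not represent the mechanism of either cited patent document.

```python
from collections import deque

class MainStorageSystem:
    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.main_disk = []      # data held by the main storage system
        self.sub_disk = []       # data held by the sub storage system
        self.pending = deque()   # write data not yet sent over the link

    def write(self, data):
        """Handle a write request from the host and return the completion notice."""
        self.main_disk.append(data)
        if self.synchronous:
            # Synchronous: transfer to the recovery site before completion.
            self.sub_disk.append(data)
        else:
            # Asynchronous: queue the transfer and complete immediately.
            self.pending.append(data)
        return "complete"

    def transfer_pending(self, count=None):
        """Send queued write data to the sub storage system in write order."""
        n = len(self.pending) if count is None else min(count, len(self.pending))
        for _ in range(n):
            self.sub_disk.append(self.pending.popleft())


sync_copy = MainStorageSystem(synchronous=True)
sync_copy.write("w1")
sync_copy.write("w2")

async_copy = MainStorageSystem(synchronous=False)
async_copy.write("w1")
async_copy.write("w2")
```

After these calls, `sync_copy.sub_disk` already matches `sync_copy.main_disk`, whereas `async_copy.sub_disk` stays empty until `transfer_pending()` runs. Draining the queue in first-in, first-out order is one simple way to keep the update sequence on the sub storage system equal to that on the main storage system.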
When the aforementioned synchronous remote copy technique is used, the process of restarting the database management system can be applied directly to the case where the main site can no longer continue its transactions because of a disaster and recovery is carried out on the recovery site. That is, the contents of the logical disks storing the log, the DB data and the status information necessary for the restart process are transferred to the recovery site by synchronous remote copy. In the case of synchronous remote copy, the contents of the logical disks on the recovery site are the same as those on the main site. When an ordinary restart process is carried out by a database management system on the recovery site (hereinafter referred to as “standby database management system”), the data of the main site can be recovered on the recovery site without losing any committed transaction and without leaving any update made by a non-committed transaction.
As described above, when synchronous remote copy is used, recovery from a disaster can be made while the contents of the transactions are guaranteed. In synchronous remote copy, however, the performance of the database management system on the main site (hereinafter referred to as “active database management system”) deteriorates. The write command response time seen by the host on the main site increases with the round-trip time of packets on the link, and that round-trip time grows as the distance between the main site and the recovery site increases and as the delay of the devices constituting the link between the storage systems increases.
Asynchronous remote copy is a remote copy technique for suppressing the increase of the command response time. As described above, in the asynchronous remote copy, the main storage system sends a notice of completion of the write command to the host on the main site without waiting for the completion of sending write data to the sub storage system. As a result, increase in the write command response time on the main site can be suppressed.
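The effect of the link on the write command response time can be estimated with a simple model. The function below is a sketch under stated assumptions: the names and parameters are hypothetical, and the propagation speed of roughly 200 km per millisecond in optical fiber is an approximate figure used only for illustration.

```python
# Approximate signal propagation speed in optical fiber (assumption).
FIBER_KM_PER_MS = 200.0

def write_response_ms(local_write_ms, distance_km, device_delay_ms, synchronous):
    """Estimate the write command response time seen by the host, in ms."""
    if not synchronous:
        # Asynchronous: completion is reported without waiting for the link.
        return local_write_ms
    # Synchronous: one round trip on the link is added to every write.
    round_trip_ms = 2 * distance_km / FIBER_KM_PER_MS + device_delay_ms
    return local_write_ms + round_trip_ms


near = write_response_ms(0.5, 100, 0.0, synchronous=True)
far = write_response_ms(0.5, 1000, 0.0, synchronous=True)
asynchronous = write_response_ms(0.5, 1000, 0.0, synchronous=False)
```

Under these assumptions, a 100 km link adds about 1 ms to each synchronous write and a 1000 km link adds about 10 ms, while the asynchronous response time does not depend on the distance.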
The following two methods have conventionally been used to apply synchronous or asynchronous remote copy to the process of restarting the standby database management system.
(1) Synchronous Log Data Transfer and Synchronous DB Data Transfer
This is a method in which both write data to be written in the log logical disk and write data to be written in the DB data logical disk are sent to the sub storage system by synchronous remote copy. Because synchronous remote copy is used, it is guaranteed that every write process issued from the DB server to the main storage system and completed is reflected on the sub storage system. For this reason, the standby database management system can be restarted by the same procedure as that used for restarting the active database management system. Accordingly, no transaction committed on the main site is lost. The performance of the active database management system, however, deteriorates when the distance between the main site and the recovery site increases or when the delay on the link increases. In this method, when the logical disk storing the status information that indicates the status of the log data is different from the logical disks storing the log or DB data, write data for the status information logical disk are also sent to the sub storage system by synchronous remote copy.
(2) Asynchronous Log Data Transfer and Asynchronous DB Data Transfer
This is a method in which both write data for the log logical disk and write data for the DB data logical disk are sent to the sub storage system by asynchronous remote copy. Because asynchronous remote copy is used, the influence of increased link delay on the performance of the database management system on the main site can be hidden easily. There is, however, a possibility that the latest transactions are lost at the time of restart on the recovery site, because there is no guarantee that all log data of the transactions committed on the main site have been reflected on the sub storage system.
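The possibility of losing the latest transactions under method (2) can be illustrated with a small sketch. The function names and the record layout are hypothetical, and the sketch assumes that the sub storage system has received a prefix of the log writes in their original order.

```python
def committed(log_records):
    """Return the set of transactions that have a commit record in the log."""
    return {r["txn"] for r in log_records if r["kind"] == "commit"}

def lost_on_failover(main_log, transferred_count):
    """Transactions committed on the main site but missing on the recovery site.

    transferred_count is the number of log writes that had reached the sub
    storage system when the disaster occurred (assumed to be a prefix).
    """
    sub_log = main_log[:transferred_count]
    return committed(main_log) - committed(sub_log)


main_log = [
    {"txn": "T1", "kind": "update"},
    {"txn": "T1", "kind": "commit"},
    {"txn": "T2", "kind": "update"},
    {"txn": "T2", "kind": "commit"},
]
lost = lost_on_failover(main_log, transferred_count=3)
```

In this example the commit record of `T2` had not been transferred when the disaster occurred, so `T2` is treated as non-committed on the recovery site even though the host on the main site had already been notified of its completion. With synchronous transfer of the log, `transferred_count` always equals the number of completed writes and the result is an empty set.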