1. Field of the Invention
The present invention relates to data backup systems. More particularly, the invention concerns a database management system that ensures consistency between primary and mirrored backup copies of a database, despite the occurrence of a suspending condition that interrupts the normal process of backing up the primary database.
2. Description of the Related Art
Database management systems ("DBMSs") not only store large amounts of data, but also facilitate the efficient access, modification, and restoration of this data. Data is typically stored using several different types of media in order to provide efficient and cost-effective data storage. Each type of media has certain features appropriate for the storage of certain types of data.
One type of data storage is electronic memory, usually dynamic or static random access memory ("DRAM" or "SRAM"). Electronic memories take the form of semiconductor integrated circuits storing millions of bytes of data. Access to such data occurs in a matter of nanoseconds. Electronic memory provides the fastest access to data since access is entirely electronic.
A second level of data storage usually involves direct access storage devices ("DASDs"). DASD storage usually involves magnetic and/or optical disks, which store bits of data as micrometer-sized magnetically or optically altered spots on a disk surface representing the binary "ones" and "zeros" that make up those bits of data. Magnetic DASD storage utilizes one or more disks coated with a remanent magnetic material. The disks are rotatably mounted within a protected environment. Each disk is divided into many concentric tracks, with the data being stored serially, bit by bit, along each track. An access mechanism known as a head disk assembly ("HDA") typically includes one or more read/write heads and is provided in each DASD for moving across the tracks to transfer the data to and from the surface of the disks as the disks rotate past the read/write heads. DASDs can store gigabytes of data, with access to such data typically being measured in milliseconds. Access to data stored in DASD is slower than access to electronic memory, since the disk and HDA must be physically positioned to access a desired data storage location.
Another type of data storage is a data storage library. In comparison to electronic memory and DASD storage, access to desired data in a library is not as fast, since a robot is needed to select and load a data storage medium containing the desired data. Data storage libraries, however, provide significantly reduced cost for very large storage capacities, such as terabytes of data storage. Data storage libraries often utilize tape media for safekeeping of data stored on other media such as DASD or electronic memory. Access to data stored in today's libraries is usually measured in seconds.
Having a backup data copy is mandatory for many businesses that cannot tolerate data loss. Some examples include stock brokers, businesses with internationally accessible data, telephone companies, and the like. Simply having the backup data available is sometimes not enough, though. It is also important to be able to quickly recover lost data. In this respect, the "dual copy" operation provides a significant improvement in speed over tape or library backup. One example of the dual copy operation involves providing "secondary" DASDs to closely mirror the contents of one or more primary DASDs. If the primary DASDs fail, the secondary DASDs can provide the necessary data. From the user's perspective, one drawback to this approach is that it effectively doubles the number of DASDs required in the storage system, thereby increasing the cost of the system.
Another data backup procedure is the "remote dual copy" operation. With remote dual copy, data is continuously backed up at a site remote from the primary data storage. This backup may occur synchronously or asynchronously. A substantial amount of control data is required to realize this process, however, in order to communicate duplexed data from one host processor to another host processor, or from one storage controller to another storage controller, or some combination thereof. Unfortunately, the overhead necessitated by the required control data can interfere with a secondary site's ability to keep up with its primary site's processing, threatening the ability of the secondary site to recover the primary site's data if needed.
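The difference between synchronous and asynchronous remote dual copy can be illustrated with a minimal sketch. The class and method names below (RemoteCopy, write, drain, lag) are hypothetical and are not taken from any actual product; the sketch models only the ordering of updates, not the control data or the host and controller communication described above.

```python
from collections import deque


class RemoteCopy:
    """Hypothetical model of a primary site mirrored to a remote secondary."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = {}
        self.secondary = {}
        self.pending = deque()  # updates not yet applied at the secondary

    def write(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # Synchronous: the secondary is updated before the write completes.
            self.secondary[key] = value
        else:
            # Asynchronous: the update is queued and shipped later.
            self.pending.append((key, value))

    def drain(self, max_updates=None):
        """Apply queued updates to the secondary; return how many were applied."""
        applied = 0
        while self.pending and (max_updates is None or applied < max_updates):
            key, value = self.pending.popleft()
            self.secondary[key] = value
            applied += 1
        return applied

    def lag(self):
        """Number of primary updates the secondary has not yet seen."""
        return len(self.pending)
```

In this sketch, a secondary that drains more slowly than the primary writes accumulates a growing lag, which corresponds to the secondary site failing to keep up with the primary site's processing.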
With both dual copy and remote dual copy, a primary DASD volume and secondary DASD volume form a duplex pair. Copying in the DASD subsystems is controlled by I/O commands to the copied volumes. Such I/O commands provide device-by-device control for establishing or suspending duplex pairs, or querying the status of a duplex pair. Device-by-device control, however, is not adequate for all disaster recovery applications. The copied data at the secondary location is usable only so long as that copied data is time-consistent with the original data. Typically, consistency is ensured by stopping the system while copying data, thus preventing further updates to the data. An improvement to this method is known as "T0" or "concurrent copy". Concurrent copy reduces the time for which the system must be halted, but suspension is still required.
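One way to see why device-by-device control can leave the secondary copy time-inconsistent is sketched below. The names used (PairState, DuplexPair, establish, suspend, query, demo) are hypothetical illustrations and do not correspond to actual I/O commands; the sketch assumes an application that writes a log volume and a data volume in a dependent order.

```python
from enum import Enum


class PairState(Enum):
    SIMPLEX = "simplex"      # no pair established
    DUPLEX = "duplex"        # writes are mirrored to the secondary
    SUSPENDED = "suspended"  # pair exists, but mirroring is paused


class DuplexPair:
    """Hypothetical model of one primary/secondary volume pair."""

    def __init__(self, name):
        self.name = name
        self.state = PairState.SIMPLEX
        self.primary = []
        self.secondary = []

    def establish(self):
        # Copy the existing primary contents, then start mirroring.
        self.secondary = list(self.primary)
        self.state = PairState.DUPLEX

    def suspend(self):
        self.state = PairState.SUSPENDED

    def query(self):
        return self.state

    def write(self, record):
        self.primary.append(record)
        if self.state is PairState.DUPLEX:
            self.secondary.append(record)


def demo():
    log = DuplexPair("log")
    data = DuplexPair("data")
    log.establish()
    data.establish()

    # The application writes a log record, then the dependent data record.
    log.write("txn-1 begin")
    data.write("txn-1 row")

    # Pairs are suspended one device at a time; a transaction lands in between.
    log.suspend()
    log.write("txn-2 begin")  # not mirrored: the log pair is already suspended
    data.write("txn-2 row")   # mirrored: the data pair is still in duplex
    data.suspend()
    return log, data
```

After demo() runs, the secondary data volume holds a row for the second transaction while the secondary log volume has no record of it, so the two secondary volumes do not reflect any single point in time on the primary.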
Another technique, "real time dual copy", ensures time consistency across the secondary volume. Examples of real time dual copy include extended remote copy ("XRC") and peer to peer remote copy ("PPRC"). Even with real time dual copy, however, primary system suspension is necessary for device-by-device control, and these suspensions can cause undesirable disruptions in some systems. XRC systems provide a partial solution by using a software-controlled data mover, where a single command stops a session such that secondary devices are time consistent, due in part to the asynchronous nature of the copy operation.
Thus, as shown above, it is difficult to accurately maintain two mirrored databases that are entirely consistent with each other. Known backup solutions either temporarily halt storage to ensure consistency between primary and secondary systems, or simply tolerate a brief lag in updating the secondary system. Because tens of thousands of I/O operations can occur in a single second, even a short delay in updating data in the secondary system constitutes a significant lag in consistency. Any one of these many transactions could be a million-dollar bond purchase. Consequently, even though many of the known backup solutions constitute significant advances and enjoy widespread commercial success today, International Business Machines Corp. is continually seeking to improve the performance and the efficiency of the systems to benefit its customers.