With the popularity and convenience of networking computer systems, data sharing among users through databases has become common in many business environments. Providing central access to information via databases requires careful consideration of database maintenance and management. Further, recovery techniques are essential in ensuring database coherence following a hardware/device failure or application logic failure.
In general, database recovery techniques reset a system, or the data stored in a system, to an operable state following damage, and provide a process for rebuilding databases by restoring a backup copy and rolling forward the logs associated with that backup copy. These techniques include static recovery point, incremental recovery point, and continuous recovery point. Time stamp recovery is another recovery technique, typically performed for application logic failures rather than device failures. While these techniques provide disaster recovery, each unfortunately falls short of preferred results in some respect.
In static recovery point, a straightforward backup methodology is utilized. A user quiesces/halts all of the database activity and then image copies all of the databases, which are sent off-site together with a backup of a RECON (recovery control) data set. The disaster recovery process involves performing moderate RECON cleanup operations, recovering the databases using standard database recovery control (DBRC) supported commands, and resuming database processing. While low in cost and simple, the static recovery point technique has several drawbacks. Firstly, the technique requires a data outage, i.e., all database access must be terminated when establishing the static recovery point. The outage could vary from tens of minutes to hours (if image copies are taken). Further, because the impact of the data outage leads to static recovery points being established infrequently, the potential data loss is at its greatest: a day's worth or more of data updates may be lost in the event of a disaster. Additionally, moderate RECON cleanup operations are required, for example marking primary logs and image copies in error so that secondary copies are selected by DBRC during recovery operations, as is well known.
In incremental recovery point, image copies, logs, and periodic RECON backups are sent off-site as they become available, with the actual transport varying as needed for a particular enterprise. The disaster recovery process includes determining the latest available logs and RECON backup, performing major RECON cleanup operations, recovering the databases using standard DBRC supported techniques, identifying and performing any needed database backout operations, and resuming processing. Advantageously, no data outage occurs for the incremental recovery point technique, and the technique is relatively low cost. However, there is medium data loss, with a minimum data loss of up to one OLDS' (on-line log data set) worth of data, e.g., up to hours' worth of data updates, since the online log data is unavailable until it is archived. Further, disaster recovery is moderately complex, because the database recovery operations are followed by database backout operations, which are generally considered complicated, error-prone operations. Additionally, incremental recovery does not work well with data sharing environments, such as IBM's IMS (information management system), since multiple, independent IMS log data streams are produced in an IMS data sharing environment, and there are no IMS utilities to handle such log data streams.
As its name implies, the continuous recovery point technique continually sends log data and RECON data off-site (i.e., electronic log vaulting), with image copies sent off-site as they occur. For example, IBM's IMS/ESA Remote Site Recovery (RSR) feature is a continuous recovery point technique. The disaster recovery process involves performing an RSR takeover, recovering the databases using standard DBRC supported techniques, and performing any required backout operations. The continuous recovery point technique avoids some of the problems mentioned for the other recovery techniques by providing minimal data loss, working with IMS data sharing, and potentially reducing the disaster recovery outage to single-digit minutes with the RSR user option of shadow databases continuously maintained off-site. Unfortunately, these benefits come at significant resource expense. In addition to the costs for communications facilities between the primary data processing location and the off-site facility, there must also be an IMS tracking system executing at the off-site location to continuously receive the log data. Further, if shadow databases are utilized, additional cost is incurred due to the necessity of dedicated DASD at the off-site facility.
For time stamp recovery, a database is recovered to some earlier state, for example, from its state at 4:00 PM to the state it was in at 2:00 PM. Of course, all updates between the current state and the state to which the database is recovered are lost, so time stamp recovery is preferably avoided if possible. When time stamp recovery is the only viable option, however, the normal IMS rules dictate that the time stamp selected must be a time when update activity against the database was quiesced, i.e., a recovery point, normally established by issuing database recovery commands (e.g., /DBR or /DBD commands) from all of the IMS subsystems currently accessing the database. Once the commands are successfully completed on all of the IMS subsystems, the databases can usually be restarted. The span of the resulting recovery point is from the completion of the last recovery (/DBR or /DBD) command to the issuance of the first start (e.g., /STA) command. For databases participating in data sharing, an OLDS switch has to occur in between these commands on each IMS subsystem. The main disadvantages of the time stamp recovery technique are that the creation of recovery points results in a temporary data outage, and that a recovery point has to exist prior to the need for one, which usually results in not having one when needed.
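The recovery point span described above can be illustrated with a minimal Python sketch. It is not part of any IMS or DBRC interface: the `SubsystemQuiesce` record, its field names, and both functions are hypothetical names introduced only for illustration, and treating the span endpoints as inclusive is an assumption. The sketch takes, for each IMS subsystem accessing a database, the completion time of its /DBR or /DBD command and the issuance time of its /STA command, and derives the span as running from the last completion to the first start. The OLDS switch required for data sharing is not modeled.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class SubsystemQuiesce:
    """Hypothetical record of one IMS subsystem's quiesce window for a database."""
    name: str
    dbr_completed: datetime  # completion of the /DBR or /DBD command
    sta_issued: datetime     # issuance of the /STA command


def recovery_point_span(
    windows: Sequence[SubsystemQuiesce],
) -> Optional[Tuple[datetime, datetime]]:
    """Return (begin, end) of the recovery point, or None if none exists.

    The span begins when the LAST subsystem completes its recovery command
    and ends when the FIRST subsystem issues its start command.
    """
    if not windows:
        return None
    begin = max(w.dbr_completed for w in windows)
    end = min(w.sta_issued for w in windows)
    # If any subsystem restarted before another had finished quiescing,
    # update activity was never stopped everywhere at the same time.
    return (begin, end) if begin < end else None


def is_valid_recovery_timestamp(
    ts: datetime, windows: Sequence[SubsystemQuiesce]
) -> bool:
    """Check whether ts falls inside the recovery point span (endpoints
    treated as inclusive, which is an assumption of this sketch)."""
    span = recovery_point_span(windows)
    return span is not None and span[0] <= ts <= span[1]
```

For instance, if one subsystem completes /DBR at 2:00 PM and issues /STA at 2:10 PM while another completes /DBR at 2:03 PM and issues /STA at 2:08 PM, the sketch yields a span of 2:03 PM to 2:08 PM; a time stamp of 2:01 PM would be rejected because the second subsystem could still have been updating the database.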
Accordingly, a need exists for a method and system that overcome the disadvantages of typical database recovery techniques. A further need exists for allowing an IMS database to be recovered to a state that is consistent with any associated relational database (e.g., DB2) tables.