1. Field of the Invention
This invention is related to data processing systems with disk array storage devices and more specifically to a method and apparatus that enhances recovery operations in such disk array storage devices.
2. Description of Related Art
A conventional data processing system that handles large quantities of data generally includes a host and a disk array storage device, or DASD. A host generally includes one or more control processors and a main memory, and it executes programs and operates on data transferred to the main memory from the disk array storage devices as known in the art. Disk array storage devices, such as those manufactured and sold by the assignee of this invention, include many physical storage devices organized in logical storage volumes or logical devices. Such a disk array storage device operates with a host adapter or equivalent module that receives an input/output command from the host over a channel in a host dependent format. The host adapter translates that input/output command into a format that disk adapters recognize and use to direct operations at a logical device level. When an operation completes in the disk array storage device, a status word returns to the host adapter to report either the success of the operation or the reason for a failure.
Significant efforts have been made to enhance the operation and performance of disk array storage devices in order to enhance the performance of an incorporating data processing system. One such effort has been directed to the enabling of ancillary disk array storage device operations with respect to main application programs. Particular emphasis has been placed upon enabling data backups without interrupting a main application program running on the host. For example, in an airline reservation database application, it is obviously desirable to allow a database backup without interrupting any of the transactions underway with the various users on the system who are making or altering reservations.
U.S. Pat. No. 6,101,497 of Ofek for a Method and Apparatus for Independent and Simultaneous Access to a Common Data Set, assigned to the same assignee as this invention, discloses a concept for making such an improvement. In accordance with that disclosure, certain physical disk drives in a disk array storage device are configured to be available to an application. These are called "standard devices". Other logical devices are configured to act either as a mirror for a standard logical device or to be split to provide a copy of the data on the standard device for some other purpose. In the context of the systems manufactured by the assignee of this invention, the second logical devices are called "BCV devices". Using the foregoing airline reservation systems as an example, the invention disclosed in U.S. Pat. No. 6,101,497 enables a BCV device to attach to a standard device thereby to act as a mirror. Generally speaking, anytime after the BCV device has achieved synchronism with the standard device, the BCV device can be split, or detached, from the standard device. The copy of the data on the BCV device is then available to other applications, such as a backup application. This allows the other application to act on the data on the BCV device independently of and simultaneously with the continued operation of the main application with data stored on the standard device.
As the use of such data processing systems has grown, certain issues that impact the splitting of a BCV device from its corresponding standard device have appeared. These include an issue of pending write data operations. Disk array storage devices of many manufacturers, including those of the assignee of this invention, utilize cache memory to enhance performance, particularly for write operations. When a host issues a write command, the data to be written transfers only to the cache memory before the operation is signaled to be complete back to the host. That data remains in the cache for some interval before that data, or overwritten data to the same location, transfers to the logical device itself. During that transient interval in the cache, the operation is complete with respect to the host, but pending with respect to the physical disk device. The entry in the cache is labelled as being a "write pending" entry. The process of transferring a "write pending" entry to a logical device is called "destaging".
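The write pending behavior described above can be illustrated with a minimal sketch. It is a toy model only, not the cache design of any actual disk array storage device; the class name `WriteBackCache` and its methods are hypothetical names introduced for illustration.

```python
class WriteBackCache:
    """Toy model of a cache in front of a logical device (hypothetical)."""

    def __init__(self):
        self.device = {}         # track -> data actually on the logical device
        self.write_pending = {}  # track -> data acknowledged but not yet destaged

    def write(self, track, data):
        # The host sees completion as soon as the cache holds the data.
        # A later write to the same track overwrites the pending entry.
        self.write_pending[track] = data
        return "complete"

    def destage(self):
        # Transfer every write pending entry to the logical device.
        for track, data in list(self.write_pending.items()):
            self.device[track] = data
            del self.write_pending[track]


cache = WriteBackCache()
cache.write(7, "A")
cache.write(7, "B")                       # overwrites the pending entry for track 7
pending_before = dict(cache.write_pending)
device_before = dict(cache.device)        # still empty: nothing destaged yet
cache.destage()
```

Between `write` and `destage` the operation is complete from the host's point of view while the device itself is unchanged, which is the interval a split operation has to account for.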
With BCV and like devices, some mechanism must manage write pending entries so that the BCV device, after it is split, accurately reflects the data on the standard device at the time of the split, updated by any write pending entries that were in the cache memory at the time that the split occurred.
In the system described in the foregoing reference, the BCV device stops acting as a mirror in response to a split command. Then the standard device with which the BCV device operates as a mirror is locked for an interval during which all write pending entries and previous write requests in the cache are managed. No write requests or other access to either the standard device or the BCV device can occur while the lock is in place. After the lock is achieved, a program module performs a number of functions on a track-by-track basis. If a write pending entry is associated with a track, the module immediately performs the necessary steps for processing that write pending entry. If the previous write operation has occurred and been destaged, the module also performs any updates to track invalid bits. After this process has been completed for all tracks in the logical volume, the lock is released. This process can be very time consuming, particularly if there are a large number of write pending entries at the time the split occurs. It was found that it was possible that the lock could be in place for seconds or even minutes under certain conditions and these delays were not acceptable in many applications.
U.S. Pat. No. 6,370,626 to Gagne et al. discloses a Method and Apparatus for Independent and Simultaneous Access to a Common Data Set that reduces this lock time by implementing an "instant split" operation. When an "instant split" command is received, the BCV device immediately detaches from the standard device and becomes accessible to an alternate application. This occurs under a lock condition that lasts in the order of microseconds during which certain control operations are accomplished but no data is transferred and no write pending entries are managed. Immediately thereafter the lock is released. Various processes in the disk array storage device thereafter manage the write pending entries in an orderly fashion even as the main application interacts with the standard device and the alternate application, such as a backup application, interacts with the BCV device.
The introduction of the instant split command overcame the unacceptable lock times of the original split command. However, applications continue to grow in complexity and the data associated with those applications continues to grow. Whereas an application and associated data may originally have been stored on a single standard device, such applications and associated data now may be stored on multiple standard devices. Some applications now require storage that exceeds the capacity of a single disk array storage device, necessitating the distribution of a single application over two or more disk array storage devices with hundreds of standard devices. In a database application, for example, one standard device may contain the database data while another standard device contains the associated log file. In such multiple device applications it was possible to institute a multiple instant split operation by issuing a series of discrete instant split operations for all the BCV devices. These would then be processed.
However, each discrete instant split operation was dispatched separately, so the order in which the instant splits occurred on different BCV devices was unpredictable. Consequently it was possible for the application program to write to a logical device that was queued to be split, but for which the split had not yet issued. This could produce inconsistent data. For example, dependent write operations in database applications involve three write operations. The first write operation transfers an entry to a log file through the cache establishing the fact that data is to be written. The second write operation transfers the data to the cache for destaging to a standard device. The third write operation transfers another log entry to the cache for the log file; this entry indicates that the operation is complete. If a multiple instant split operation is conducted so that the instant split for the logical volume containing the data is completed first, the data file may be updated without updating the log file in the BCV devices. There would be no record of the data change in the split BCV devices. Alternatively if the log files were destaged and updated before the data file was updated, the log file could indicate the completion of an operation without the data actually having been transferred to its split BCV devices. Consequently in either event, the data in the split BCV device will be inconsistent.
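The two failure cases described above can be reproduced with a small simulation. This is a sketch under simplifying assumptions: each BCV is modeled as an exact copy of its standard device frozen at the moment of its split, the cache is ignored, and the names `run` and `consistent` are hypothetical helpers introduced here.

```python
def run(split_points):
    """Apply three dependent writes; split each BCV at the given step.

    split_points maps a device name to the number of writes applied before
    that device's BCV is split. Returns the frozen BCV contents.
    """
    writes = [("log", "begin"), ("data", "payload"), ("log", "done")]
    standard = {"log": [], "data": []}
    bcv = {}
    for step in range(len(writes) + 1):
        for name, point in split_points.items():
            if point == step:
                bcv[name] = list(standard[name])  # BCV freezes here
        if step < len(writes):
            device, entry = writes[step]
            standard[device].append(entry)
    return bcv


def consistent(bcv):
    log, data = bcv["log"], bcv["data"]
    if data and "begin" not in log:
        return False  # data changed with no record of it in the log
    if "done" in log and not data:
        return False  # log claims completion but the data is missing
    return True


log_first = run({"log": 0, "data": 2})    # log split before any write
data_first = run({"data": 1, "log": 3})   # data split before the data write
together = run({"log": 2, "data": 2})     # both split at the same point
```

When the two splits land at different points in the write sequence, either inconsistency can appear; when both BCVs are frozen at the same point, the copy is consistent even if the transaction is still in flight.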
It is difficult at best then to identify any such inconsistent data, particularly when dependent data transfers are involved. Consequently it becomes very difficult to recover data in the event of some type of malfunction. What is needed is a method and apparatus for enabling such instant split operations to occur such that related data on multiple split BCV devices or the like is consistent thereby to prevent any data corruption.
Therefore it is an object of this invention to provide a method and apparatus for enhancing recovery operations in disk array storage devices.
Another object of this invention is to provide a method and apparatus for enabling a group of related logical devices operating as mirrors to be split for operation with other applications while maintaining data consistency.
In accordance with one embodiment, this invention is implemented in a data processing system including a host and at least one disk array storage device including a plurality of first logical devices for interacting with a first application and including a second logical device corresponding to each first logical device that can interact in a first mode as a mirror for a corresponding first logical device and that can interact in a second mode with a second application. A command is issued to shift a plurality of identified second logical devices from the first mode to the second mode in a consistent fashion. In response, a request data structure is generated for each of the second logical devices identified in the command. Then all interactions between the first application and the first logical devices corresponding to the identified second logical devices are disabled. Next, all of the identified second logical devices are shifted to the second mode after said disablement. Upon completion of the shift, interactions between the first application and the first logical devices corresponding to the identified second logical devices resume, and a return to the host is generated to indicate that the shift to the second mode has been completed, whereupon the data in the identified second logical devices is consistent.
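The sequence of steps in this embodiment might be sketched as follows. This is an illustrative model only, not the claimed implementation: the classes `StandardDevice` and `BcvDevice`, the function `consistent_split`, the dictionary used as the request data structure, and the use of a simple flag to disable interactions are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StandardDevice:
    """First logical device used by the main application (hypothetical)."""
    name: str
    tracks: dict = field(default_factory=dict)
    enabled: bool = True

    def write(self, track, data):
        if not self.enabled:
            raise RuntimeError(f"{self.name}: writes disabled during split")
        self.tracks[track] = data


@dataclass
class BcvDevice:
    """Second logical device: a mirror until it is split (hypothetical)."""
    standard: StandardDevice
    mode: str = "mirror"
    snapshot: dict = field(default_factory=dict)


def consistent_split(bcvs):
    # 1. Generate a request data structure for each identified BCV device.
    requests = [{"bcv": b, "done": False} for b in bcvs]
    # 2. Disable interactions with every corresponding standard device.
    for r in requests:
        r["bcv"].standard.enabled = False
    try:
        # 3. Shift all identified BCV devices to the second mode.
        for r in requests:
            b = r["bcv"]
            b.snapshot = dict(b.standard.tracks)
            b.mode = "split"
            r["done"] = True
    finally:
        # 4. Resume interactions with the standard devices.
        for r in requests:
            r["bcv"].standard.enabled = True
    # 5. Return status to the host.
    return all(r["done"] for r in requests)


log_dev, data_dev = StandardDevice("log"), StandardDevice("data")
log_dev.write(0, "begin")
data_dev.write(0, "payload")
bcvs = [BcvDevice(log_dev), BcvDevice(data_dev)]
ok = consistent_split(bcvs)
log_dev.write(1, "done")  # lands after the split, so it is absent from the BCV copy
```

Because no write can reach any of the standard devices between the first and last split, every BCV in the group reflects the same point in the application's write sequence.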