This invention relates to data storage in a computerized network or system. More particularly, the present invention relates to a new and improved technique of host-initiated synchronization of data that is stored on both a local storage device and a remote mirroring fail-over storage device. The data stored by the host on the local storage device is mirrored to the remote storage device, and a synchronization procedure enables the host and remote storage device easily and quickly to "roll back" to, and continue operations from, a stable, coherent state in the event of a failure of the local storage device.
Computerized systems are commonly used to operate various businesses or enterprises. In many cases, the data that is kept on the computers and data storage devices is critical to the functioning of the enterprise. A temporary inability to access this data can halt business operations, and a total loss or corruption of the data can severely cripple the entire enterprise. Therefore, it is important to such enterprises to maintain availability and validity of the data.
One technique to ensure data availability and validity is to store the data in more than one storage device, such as in primary and secondary storage devices. In this case, the secondary storage device maintains a "mirrored," or duplicate, copy of the data. In the event of a failure of the primary storage device, operations can resume using the secondary storage device and the mirrored data.
Additionally, the secondary storage device is typically maintained at a geographically remote location from the primary storage device, such as at a different city or state, while the primary storage device is kept locally. In this manner, a geographical disturbance, such as a local citywide power outage, will not affect both storage devices, and operations can eventually resume.
Also, the local and remote storage devices are typically accessed by host devices, or storage servers, that serve the data storage requirements of various client devices. At least one such host device is maintained at the local site and another at the remote location to access the local and remote storage devices, respectively. Therefore, when the local storage device fails, the remote host device, using the remote storage device, takes over serving the data storage requirements of the various clients.
Various methods have been developed to mirror, or duplicate, the data from the primary storage device at the local site to the alternate, secondary storage device at the remote site. Such remote mirroring solutions ensure the continuance of business in the event of a geographical disaster. Many of these solutions, however, have either performance or coherency synchronization issues. Performance considerations require that very little time be taken to perform a "fail-over" to, or switch to, the remote storage and host devices, so as not to degrade the overall performance of the clients using the backed-up data. Coherency synchronization requires that the state of the stored data between the local and remote storage devices be put in a "coherent state" at which it is assured that both have correct, up-to-date data that may be used by a file system or database. In the event of a fail-over situation, the synchronization difficulties of current mirroring techniques can result in time-consuming special efforts to generate a coherent state in the remote storage device through file system check and recovery procedures, so that applications executing on the various clients can proceed to operate.
It is with respect to these and other background considerations that the present invention has evolved.
The present invention enables efficient remote data mirroring and "fail-over" capabilities in a computer system wherein a local host device stores data on a local storage device on behalf of various client devices, and mirrors the data storage on a remote storage device. "Fail-over" refers to a situation in which the local storage device can no longer service data access requests, so the client devices must switch to using the remote storage device with a remote host device for data backup processing. The local host device periodically initiates data synchronization procedures for the local and remote storage devices. Information regarding the latest synchronization procedures is maintained within the local host, local storage and remote storage devices. The synchronization information defines a common, known, coherent state of stored data for all of these devices. The time at which a data synchronization occurs is called a "checkpoint," and the condition of the stored data at which the coherent state is defined is called the "checkpoint state."
The remote storage device maintains a "snapshot" of the data at the latest checkpoint state. The snapshot is essentially a copy of a portion of the data as the data existed at the last checkpoint state. Changes to the stored data on the remote storage device are accepted after each previously occurring checkpoint, but the data that was present at the last checkpoint is transferred to and preserved in the snapshot, so it can be restored at a later time if necessary.
Since the remote storage device maintains information describing the checkpoint state, in the event of a fail-over condition, the remote host device quickly and easily "rolls back" the state of the data stored on the remote storage device to the last common checkpoint state. The data is restored from the snapshot. Applications executing on the client devices, thus, restart at the restored checkpoint state with a minimum of interruption.
These and other improvements are achieved by storing and synchronizing data between a host device, a primary storage device and a secondary storage device. The host device stores data on the primary storage device on behalf of client devices. The data stored on the primary storage device is mirrored to the secondary storage device. Data synchronization between the host device and the primary storage device is initiated by the host device. A checkpoint message is issued from the host device to the primary storage device. The checkpoint message indicates that a storage state of the host device is at a stable consistent state. Data synchronization between the primary and secondary storage devices is performed by the primary storage device. The checkpoint message is then forwarded from the primary storage device to the secondary storage device. An incremental snapshot of the mirrored data is generated on the secondary storage device at the predetermined checkpoint indicated by the checkpoint message. The incremental snapshot includes data and information describing the mirrored data at the predetermined checkpoint to preserve a storage state of the secondary storage device at the predetermined checkpoint.
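The ordering of the synchronization steps described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the invention: the class names (`Host`, `PrimaryStorage`, `SecondaryStorage`), the integer checkpoint identifier, the dictionary model of stored blocks, and the choice to buffer writes until the checkpoint are all hypothetical. The sketch shows only that the host flushes its data and issues the checkpoint message, and that the primary storage device synchronizes the secondary storage device before forwarding that message.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SecondaryStorage:
    """Hypothetical model of the secondary (mirrored) storage device."""
    volume: Dict[int, bytes] = field(default_factory=dict)
    last_checkpoint: Optional[int] = None

    def write(self, block: int, data: bytes) -> None:
        self.volume[block] = data

    def on_checkpoint(self, checkpoint_id: int) -> None:
        # A real device would begin a new incremental snapshot here;
        # this sketch only records which checkpoint was reached.
        self.last_checkpoint = checkpoint_id


@dataclass
class PrimaryStorage:
    """Hypothetical model of the primary storage device."""
    secondary: SecondaryStorage
    volume: Dict[int, bytes] = field(default_factory=dict)
    unmirrored: List[Tuple[int, bytes]] = field(default_factory=list)
    last_checkpoint: Optional[int] = None

    def write(self, block: int, data: bytes) -> None:
        self.volume[block] = data
        self.unmirrored.append((block, data))  # assumption: mirroring is deferred

    def on_checkpoint(self, checkpoint_id: int) -> None:
        # Synchronize the secondary first, then forward the checkpoint message.
        for block, data in self.unmirrored:
            self.secondary.write(block, data)
        self.unmirrored.clear()
        self.last_checkpoint = checkpoint_id
        self.secondary.on_checkpoint(checkpoint_id)


@dataclass
class Host:
    """Hypothetical model of the host device that initiates synchronization."""
    primary: PrimaryStorage
    dirty: Dict[int, bytes] = field(default_factory=dict)
    next_checkpoint: int = 1
    last_checkpoint: Optional[int] = None

    def write(self, block: int, data: bytes) -> None:
        self.dirty[block] = data  # assumption: the host buffers writes

    def checkpoint(self) -> int:
        # Flush host data to the primary, then issue the checkpoint message.
        for block, data in self.dirty.items():
            self.primary.write(block, data)
        self.dirty.clear()
        checkpoint_id = self.next_checkpoint
        self.next_checkpoint += 1
        self.last_checkpoint = checkpoint_id
        self.primary.on_checkpoint(checkpoint_id)
        return checkpoint_id
```

After `Host.checkpoint()` returns in this sketch, all three objects hold the same checkpoint identifier, which corresponds to the common, known, coherent state that the checkpoint message is meant to establish.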
It is preferable that data be sent from the host device to the primary storage device and forwarded to the secondary storage device, so both the primary and secondary storage devices can update their storage state to be consistent with the host device. It is further preferable, when new data is sent from the host device to the primary storage device and then to the secondary storage device after the predetermined checkpoint, that the secondary storage device transfer any preexisting data, if it is replaced by the new data, to the incremental snapshot. Thus, the incremental snapshot maintains the storage state of the secondary storage device at the predetermined checkpoint.
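The transfer of preexisting data to the incremental snapshot resembles what is commonly called copy-on-write. The following Python sketch illustrates the idea under assumptions that are not taken from the description: the class name `SecondaryVolume`, a fixed number of blocks held in a list, and a dictionary standing in for the incremental snapshot are all hypothetical. Only the first overwrite of a block after a checkpoint preserves the old contents, because later overwrites would replace data that did not exist at the checkpoint.

```python
from typing import Dict, List


class SecondaryVolume:
    """Hypothetical secondary storage device with an incremental snapshot."""

    def __init__(self, num_blocks: int) -> None:
        # Assumption: a fixed-size volume in which every block always exists.
        self.blocks: List[bytes] = [b""] * num_blocks
        # Incremental snapshot: block index -> contents at the last checkpoint.
        self.snapshot: Dict[int, bytes] = {}

    def checkpoint(self) -> None:
        # A new checkpoint starts a new, empty incremental snapshot.
        self.snapshot = {}

    def write(self, index: int, data: bytes) -> None:
        # Preserve the preexisting data only on the first overwrite
        # since the checkpoint, then accept the new data.
        if index not in self.snapshot:
            self.snapshot[index] = self.blocks[index]
        self.blocks[index] = data
```

For example, if block 0 holds `b"A"` at the checkpoint and is then written twice, the snapshot in this sketch keeps `b"A"` while the volume holds the most recent data, so the snapshot grows only with the number of distinct blocks changed since the checkpoint.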
It is also preferable that these steps be performed in conjunction with failing-over from utilization of the first host device and the primary storage device to utilization of a second host device and the secondary storage device. In such a fail-over situation, a failure of the first host device and/or the primary storage device is detected, and the second host device and the secondary storage device are signaled that they are to be utilized for primary data storage. An image of the data stored on the secondary storage device is assembled from the most recent incremental snapshot, and the second host device is informed when the image is complete, so the second host device and secondary storage device are ready to serve as primary data storage.
The previously mentioned and other improvements are also achieved by switching a client device from utilizing a first host device and a primary storage device to utilizing a second host device and a secondary (mirrored) storage device for primary data storage, upon failure of the first host device and/or the primary storage device. The failure of the first host device and/or the primary storage device is detected. The second host device is signaled that it is to be used for primary data storage. The secondary storage device is signaled to restore the mirrored data stored thereon to a preexisting common stable state that was established at a data synchronization checkpoint at which data was synchronized between the first host device, the primary storage device and the secondary storage device. An image of the mirrored data at the preexisting common stable state is assembled. The second host device is signaled that the data image is complete, so the second host device and the secondary storage device are ready to serve as primary data storage for the client device.
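The sequence of signals in this fail-over procedure can be sketched in Python as follows. The sketch is illustrative only and rests on assumptions: the names `Role`, `SecondHost`, `SecondaryStorage` and `fail_over`, the liveness callback used to detect the failure, and the in-memory block list are hypothetical and are not specified by the description. It shows the second host device refusing to serve data until it has been signaled that the data image is complete.

```python
from enum import Enum
from typing import Callable, Dict, List


class Role(Enum):
    STANDBY = "standby"
    RESTORING = "restoring"
    PRIMARY = "primary"


class SecondaryStorage:
    """Hypothetical secondary storage device holding preserved old data."""

    def __init__(self, blocks: List[bytes], old_data: Dict[int, bytes]) -> None:
        self.blocks = blocks
        self.old_data = old_data  # block index -> data at the last checkpoint

    def restore_to_checkpoint(self) -> None:
        # Return the preserved data to the volume to recreate the stable state.
        for index, preexisting in self.old_data.items():
            self.blocks[index] = preexisting
        self.old_data.clear()


class SecondHost:
    """Hypothetical second host device that takes over after a failure."""

    def __init__(self) -> None:
        self.role = Role.STANDBY

    def signal_take_over(self) -> None:
        self.role = Role.RESTORING

    def signal_image_complete(self) -> None:
        self.role = Role.PRIMARY

    def read(self, storage: SecondaryStorage, index: int) -> bytes:
        if self.role is not Role.PRIMARY:
            raise RuntimeError("not yet serving as primary data storage")
        return storage.blocks[index]


def fail_over(primary_is_alive: Callable[[], bool],
              host: SecondHost,
              storage: SecondaryStorage) -> bool:
    """Run the fail-over sequence; return True if a fail-over took place."""
    if primary_is_alive():            # 1. detect the failure (assumed callback)
        return False
    host.signal_take_over()           # 2. signal the second host device
    storage.restore_to_checkpoint()   # 3. restore the common stable state
    host.signal_image_complete()      # 4. signal that the image is complete
    return True
```

How the failure is detected and how the signals are transported are left open here; the callback merely marks the point in the sequence where detection occurs.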
The secondary storage device preferably includes a data volume storage area and an old-data storage area. The old-data storage area preferably stores preexisting data that was stored in the data volume storage area at the preexisting common stable state, but that was replaced in the data volume storage area by new data. In this case, it is preferable that the data image is assembled from the preexisting data and checkpoint synchronization information issued by the first host device. It is further preferable to restore the secondary storage device to the preexisting common stable state by returning the preexisting data to the data volume storage area.
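The relationship between the data volume storage area and the old-data storage area can be illustrated with a short Python sketch. The function names `assemble_image` and `restore_in_place`, the list model of the data volume, and the dictionary model of the old-data area are assumptions made for illustration, and the sketch omits the checkpoint synchronization information. It distinguishes assembling an image, which leaves the data volume untouched, from restoring, which returns the preexisting data to the data volume.

```python
from typing import Dict, List


def assemble_image(data_volume: List[bytes],
                   old_data: Dict[int, bytes]) -> List[bytes]:
    """Build a view of the volume as it existed at the last checkpoint.

    Blocks changed since the checkpoint are taken from the old-data area;
    all other blocks are taken from the data volume as-is.
    """
    image = list(data_volume)  # copy, so the volume itself is not modified
    for index, preexisting in old_data.items():
        image[index] = preexisting
    return image


def restore_in_place(data_volume: List[bytes],
                     old_data: Dict[int, bytes]) -> None:
    """Return the preexisting data to the data volume storage area."""
    data_volume[:] = assemble_image(data_volume, old_data)
    old_data.clear()  # the volume now matches the checkpoint state
```

In this sketch the cost of either operation is proportional to the number of blocks held in the old-data area rather than to the size of the whole volume, apart from the list copy used to keep the example short.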
The previously mentioned and other improvements are also achieved in a mirrored storage computer system for servicing data storage requirements of software applications executing on client devices. The mirrored storage computer system comprises a host device, a primary storage device, and a secondary storage device. The host device services the software applications requiring data storage, stores data externally, issues external storage access requests, and initiates periodic external data synchronization at stable storage states. The data synchronizations at stable storage states are referred to as data synchronization checkpoints, wherein data stored on the host device is made coherent with externally stored data with respect to a file system synchronization point. The primary storage device is connected to the host device to serve as the external data storage, stores data received from the host device, responds to the storage access requests from the host device, makes the data stored in the primary storage device coherent with the data stored on the host device at the data synchronization checkpoints, and forwards the data and the data synchronization checkpoints to the secondary storage device. The secondary storage device is connected to the primary storage device for secondary (mirrored) external data storage, receives the data and the data synchronization checkpoints, stores the data, makes the data stored in the secondary storage device coherent with the data stored on the host device and the primary storage device at the data synchronization checkpoints, and generates a snapshot of the stored data upon receiving the data synchronization checkpoints. The snapshot represents the stable storage state in the secondary storage device at the data synchronization checkpoint.
The mirrored storage system preferably further comprises a second host device. The second host device is preferably connected to the secondary storage device, takes over servicing the software applications requiring the data storage, and externally stores data on the secondary storage device by issuing external data access requests to the secondary storage device. Preferably, the second host device takes over the servicing of the software applications upon failure of the first host device and/or the primary data storage device by utilizing the data stored on the secondary storage device. The secondary storage device also preferably stores the data received from the second host device and responds to the storage access requests from the second host device. The second host device preferably sends a restore signal to the secondary storage device instructing the secondary storage device to restore the data stored thereon to the stable storage state upon the failure of the first host device and/or the primary storage device. Upon receipt of the restore signal, the secondary storage device preferably restores its data to the stable storage state from the data synchronization checkpoint and the snapshot.