The present invention relates generally to computer storage systems, and more particularly to remote mirroring in distributed computer storage systems.
In a common computer system architecture, a host computer is coupled to a computer storage system that provides non-volatile storage for the host computer. The computer storage system includes, among other things, a number of interconnected storage units. Each storage unit includes a number of physical or logical storage media (for example, a disk array). For convenience, a group of one or more physical disks that are logically connected to form a single virtual disk is referred to hereinafter as a “Logical Unit” (LU). Data from the host computer is stored in the computer storage system, and specifically in the various storage units within the computer storage system.
One problem in a computer storage system is data loss or unavailability, for example, caused by maintenance, repair, or outright failure of one or more storage units. In order to prevent such data loss or unavailability, a copy of the host data is often stored in multiple storage units that are operated at physically separate locations. For convenience, the practice of storing multiple copies of the host data in physically separate storage units is referred to as “remote mirroring.” Remote mirroring permits the host data to be readily retrieved from one of the storage units when the host data at another storage unit is unavailable or destroyed.
Therefore, in order to reduce the possibility of data loss or unavailability in a computer storage system, a “remote mirror” (or simply a “mirror”) is established to manage multiple images. Each image consists of one or more LUs, which are referred to hereinafter collectively as a “LU Array Set.” It should be noted that the computer storage system may maintain multiple mirrors simultaneously, where each mirror manages a different set of images.
Within a particular mirror, one image is designated as a master image, while each other image within the mirror is designated as a slave image. For convenience, the storage unit that maintains the master image is referred to hereinafter as the “master storage unit,” while a storage unit that maintains a slave image is referred to hereinafter as a “slave storage unit.” It should be noted that a storage unit that supports multiple mirrors may operate as the master storage unit for one mirror and the slave storage unit for another mirror.
In order for a mirror to provide data availability such that the host data can be readily retrieved from one of the slave storage units when the host data at the master storage unit is unavailable or destroyed, it is imperative that all of the slave images be synchronized with the master image such that all of the slave images contain the same information as the master image. Synchronization of the slave images is coordinated by the master storage unit.
Under normal operating conditions, the host writes host data to the master storage unit, which stores the host data in the master image and also coordinates all data storage operations for writing a copy of the host data to each slave storage unit in the mirror and verifying that each slave storage unit receives and stores the host data in its slave image. The data storage operations for writing the copy of the host data to each slave storage unit in the mirror can be handled in either a synchronous manner or an asynchronous manner. In synchronous remote mirroring, the master storage unit ensures that the host data has been successfully written to all slave storage units in the mirror before sending an acknowledgment to the host, which results in relatively high latency, but ensures that all slaves are updated before informing the host that the write operation is complete. In asynchronous remote mirroring, the master storage unit sends an acknowledgment message to the host before ensuring that the host data has been successfully written to all slave storage units in the mirror, which results in relatively low latency, but does not ensure that all slaves are updated before informing the host that the write operation is complete.
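The difference between the two acknowledgment orderings described above can be illustrated with a minimal Python sketch. It is not taken from any actual storage unit implementation: the class names `SlaveUnit` and `MasterUnit`, the `pending` queue, and the `flush` method are hypothetical, and images are modeled as in-memory dictionaries keyed by block number.

```python
from collections import deque


class SlaveUnit:
    """Hypothetical stand-in for a slave storage unit holding one slave image."""

    def __init__(self):
        self.image = {}

    def write(self, block, data):
        self.image[block] = data
        return True  # confirms that the block was stored


class MasterUnit:
    """Hypothetical master storage unit coordinating writes to its slaves."""

    def __init__(self, slaves, synchronous):
        self.image = {}
        self.slaves = slaves
        self.synchronous = synchronous
        # Slave updates not yet sent (used in asynchronous mode only).
        self.pending = deque()

    def host_write(self, block, data):
        self.image[block] = data
        if self.synchronous:
            # Synchronous: every slave must confirm before the host is acknowledged.
            for slave in self.slaves:
                if not slave.write(block, data):
                    raise IOError("slave did not confirm the write")
        else:
            # Asynchronous: acknowledge first, update the slaves later.
            self.pending.append((block, data))
        return "ack"

    def flush(self):
        """Send the queued updates to every slave (asynchronous mode)."""
        while self.pending:
            block, data = self.pending.popleft()
            for slave in self.slaves:
                slave.write(block, data)
```

In the asynchronous case, the slave images lag the master image between `host_write` returning and `flush` completing, which is the window in which an acknowledged write may not yet be present on every slave.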
In both synchronous and asynchronous remote mirroring, it is possible for the master storage unit to fail sometime between receiving a write request from the host and updating the master image and all of the slave images. The master storage unit may fail, for example, due to an actual hardware or software failure in the master storage unit or an unexpected power failure. If the master storage unit was in the process of completing one or more write operations at the time of the failure, the master storage unit may not have updated any image, may have updated the master image but no slave image, may have updated the master image and some of the slave images, or may have updated the master image and all of the slave images for any particular write operation. Furthermore, the master storage unit may or may not have acknowledged a particular write request prior to the failure.
After the failure, it may not be possible for the master storage unit to determine the status of each slave image, and specifically whether a particular slave image matches the master image. Therefore, the master storage unit typically resynchronizes all of the slave images by copying the master image block-by-block to each of the slave storage units. This synchronizes the slave images to the master image, but does not guarantee that a particular write request was completed. Unfortunately, copying the entire master image to all slave storage units can take a significant amount of time depending on the image size, the number of slave storage units, and other factors. It is not uncommon for such a resynchronization to take hours to complete, especially for very large images.
Thus, there is a need for a system, device, and method for quickly resynchronizing slave images following a failure.
In accordance with one aspect of the present invention, a master storage unit utilizes a write intent log to quickly resynchronize slave images following a failure in the master storage unit. The write intent log is preserved through the failure, such that the write intent log is available to the master storage unit upon recovery from the failure. The write intent log identifies any portions of the slave images that may be unsynchronized from the master image. The master storage unit resynchronizes only those portions of the slave images that may be unsynchronized as indicated in the write intent log.
In a preferred embodiment, the write intent log identifies any image blocks that may be unsynchronized. In order to resynchronize the slave images, the master storage unit copies only those image blocks indicated in the write intent log from the master image to the slave images.
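The block-level resynchronization described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the claimed implementation: the names `WriteIntentLog`, `mirrored_write`, and `resynchronize` are hypothetical, images are modeled as fixed-length lists of blocks, and the log is an in-memory set standing in for storage that survives the failure. The policy of marking a block before the write begins and clearing it once every image has been updated is likewise an assumption made for the example.

```python
class WriteIntentLog:
    """Hypothetical log of block numbers that may be unsynchronized.

    A real log would be kept in storage that is preserved through the failure.
    """

    def __init__(self):
        self.blocks = set()

    def mark(self, block):
        self.blocks.add(block)

    def clear(self, block):
        self.blocks.discard(block)


def mirrored_write(log, master, slaves, block, data):
    """Write one block to the master image and to every slave image."""
    log.mark(block)           # assumed: record the intent before any image changes
    master[block] = data
    for slave in slaves:
        slave[block] = data
    log.clear(block)          # assumed: clear once all images hold the new data


def resynchronize(log, master, slaves):
    """Copy only the logged blocks from the master image to each slave image."""
    copied = 0
    for block in sorted(log.blocks):
        for slave in slaves:
            slave[block] = master[block]
        copied += 1
    log.blocks.clear()
    return copied             # number of blocks copied per slave


# Simulated failure: block 3 reaches the master and one slave, then the
# master fails before the second slave is updated and the log is cleared.
log = WriteIntentLog()
master = [b"old"] * 8
slaves = [[b"old"] * 8, [b"old"] * 8]
log.mark(3)
master[3] = b"new"
slaves[0][3] = b"new"
blocks_copied = resynchronize(log, master, slaves)
```

Only the single logged block is copied during recovery, regardless of the image size, whereas a full resynchronization would copy all eight blocks to each slave.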
By resynchronizing only those portions of the slave images that may be unsynchronized, the master storage unit is able to resynchronize the slave images in significantly less time (perhaps seconds rather than hours) than it would have taken to copy the entire master image block-by-block to each of the slave storage units.